When AI "Fails" at Customer Support: What Are We Actually Learning?
This article is adapted from earlier academic work and has been edited for a general audience.
By now, the Klarna story is familiar. In early 2024, the Swedish fintech announced that its AI assistant, built with OpenAI, was handling two-thirds of customer service chats and performing the equivalent work of 700 agents. The company froze hiring and its workforce shrank significantly over the following year. Then, in mid-2025, CEO Sebastian Siemiatkowski publicly acknowledged the approach had gone too far. Quality had dropped. Customers were unhappy. Klarna began rehiring.
The story has been widely read as a vindication of human workers: AI tried to replace human support and could not deliver, and people turned out to be needed after all. But is that really what happened here? And does it tell us what we think it tells us about the future of AI in customer service?
Beyond the replacement headline
When organisations adopt AI in customer-facing roles, the conversation tends to settle into a familiar binary. Either AI replaces people or it does not. Either the technology succeeds or it fails. Klarna's reversal has been placed firmly in the "failure" column, cited as proof that human connection remains irreplaceable.
But the evidence is more ambiguous than that framing suggests. Research on AI and labour markets consistently finds that these technologies rarely eliminate entire roles. What they do is restructure the task composition of work, automating certain activities while leaving others untouched. Eloundou and colleagues estimate that large language models can accelerate between 15% and 56% of workplace tasks depending on how deeply they are integrated into workflows. Brynjolfsson, Li and Raymond find that in customer service specifically, AI-driven productivity gains are greatest for workers handling moderately complex problems, not the routine queries that are easiest to automate.
What Klarna appeared to do was collapse this distinction. It treated the automation of high-volume, repetitive enquiries as though it meant the entire role could be removed. The AI could process routine requests at speed and scale. But the roles that were cut had also carried a wider set of responsibilities: navigating ambiguity, exercising judgement in complex cases, maintaining the relational dimension of customer interactions. Those capabilities did not transfer automatically to the system that replaced them.
A question of intent
One reading of Klarna's difficulties is that AI simply cannot handle the nuance of human interaction. That may be partly true today. But there is another possibility that receives less attention.
Siemiatkowski acknowledged that cost had been the predominant factor in how the company organised its support. The AI was measured on resolution speed, volume, and cost per interaction. It was not designed to optimise for whether customers felt understood, whether complex financial situations were handled with care, or whether trust was maintained across the relationship.
In other words, the system did what it was built to do. It optimised for the objective it was given. Whether a different objective would have produced a different outcome is a question worth sitting with, because it shifts the discussion from what AI is capable of to what we choose to ask it to do.
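To make that concrete, here is a purely hypothetical sketch, not a description of Klarna's actual system: a toy triage rule that decides whether a conversation stays with the bot or is escalated to a human, parameterised by weights on cost and on relational quality. The `Ticket` fields, the weights, and the scoring rule are all invented for illustration; the only point is that the same technology routes work very differently depending on the objective it is asked to optimise.

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    complexity: float        # 0 = routine query, 1 = highly ambiguous case
    escalation_cost: float   # extra cost of routing to a human agent (arbitrary units)

def escalation_score(ticket: Ticket, w_cost: float, w_trust: float) -> float:
    """Hypothetical objective: a positive score means route to a human.

    The benefit of human handling rises with complexity (a stand-in for
    relational quality and trust); the penalty is the added cost of an agent.
    """
    return w_trust * ticket.complexity - w_cost * ticket.escalation_cost

tickets = [
    Ticket(complexity=0.1, escalation_cost=1.0),   # routine refund-status query
    Ticket(complexity=0.9, escalation_cost=1.0),   # disputed charge, distressed customer
]

# Cost-only objective: neither conversation is ever escalated.
print([escalation_score(t, w_cost=1.0, w_trust=0.0) > 0 for t in tickets])   # [False, False]

# Objective that also weights relational quality: the complex case reaches a human.
print([escalation_score(t, w_cost=0.5, w_trust=1.0) > 0 for t in tickets])   # [False, True]
```

Under a cost-only objective the complex case never reaches a person; give relational quality any weight and it does. That is the sense in which the outcome reflects a choice of objective rather than a fixed ceiling on what the technology can do.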
Research by Wang and colleagues, examining AI-enabled conversational agents as "digital employees" in frontline service, found that when these systems are designed to redistribute work rather than simply replace it, they can enhance professional identity and job control among human workers. The design matters. The intent matters. And the assumption that AI will always fall short of human interaction in these roles may say more about current implementation choices than about any permanent limitation of the technology.
That is not a comfortable conclusion. It does not settle the question in either direction. But it does suggest that treating Klarna as a simple story of technological failure risks overlooking what actually went wrong, and what might go differently next time.
What the restructuring reveals
Perhaps the most telling part of the Klarna story is not the reversal itself, but what the company rebuilt in its place. Klarna did not recreate its previous support operation. Instead, it introduced what Siemiatkowski described as an "Uber-type setup," recruiting students, rural workers, and brand enthusiasts as remote freelance agents with flexible schedules.
This is a different kind of restructuring. The AI did not simply fail and leave things as they were. It changed the conditions under which human workers re-entered the picture. The jobs that came back were not the same jobs that had been removed. They were more precarious, more platform-mediated, and organised around a logic of scalable flexibility rather than stable employment.
Research on algorithmic management highlights a broader version of this pattern. Jarrahi and colleagues describe how machine-learning systems increasingly perform managerial functions such as task allocation, monitoring, and performance evaluation. Spencer characterises this as a form of digital Taylorism, where knowledge that once resided with experienced workers becomes embedded in technology and returned to them as predefined instructions. The worker remains, but the nature of the work shifts.
This dynamic extends well beyond Klarna. Across the customer support sector, AI is reshaping not only whether humans are involved but how their involvement is structured, what skills are valued, and what autonomy they retain. The question of human presence in customer service may ultimately matter less than the question of what that presence looks like.
The skills that erode quietly
There is a further consequence of AI-driven restructuring that tends to surface only after the fact. When Klarna's support roles were removed, the company reportedly asked engineers, designers, and marketing staff to help handle customer enquiries. This revealed something that had not been visible before: the knowledge and judgement carried by customer support workers was distributed through the organisation in ways that no one had fully mapped.
Customer support is often treated as a cost centre, a function whose value is measured in tickets resolved and handling times reduced. But experienced support workers accumulate something harder to quantify: pattern recognition from thousands of edge cases, an understanding of how customers actually behave versus how systems assume they behave, and the ability to identify problems that have not yet been formally categorised. Brynjolfsson, Li and Raymond note that customer service workers increasingly contribute training data through their interactions, meaning that their experiential knowledge feeds directly into the AI systems that may eventually reshape or replace their roles.
When those roles disappear quickly, that layer of organisational intelligence can erode before anyone recognises its value. And once eroded, it is not easily rebuilt, particularly when the roles that return are structured around flexibility and throughput rather than depth and continuity.
An unsettled question
The Klarna story has been absorbed into a broader narrative about the limits of AI. In this narrative, the technology overreached, humans proved essential, and the natural order reasserted itself. It is a satisfying arc. It is also incomplete.
What Klarna's experience reveals is not a definitive answer about whether AI can or cannot handle customer support. It reveals how much depends on the choices surrounding adoption: what objective the system is given, how work is reorganised around it, what is preserved and what is allowed to erode, and on whose terms human workers are brought back into the picture.
AI in customer support is not replacement or augmentation in any stable sense. It is an ongoing negotiation between efficiency and quality, between cost structures and customer trust, between what technology can do today and what organisations choose to ask of it. The outcomes of that negotiation are not fixed. They depend on decisions that are still being made, by companies, by regulators, and by the workers whose roles are being reshaped.
There is nothing inevitable about where this goes. That is precisely why it matters.
References
Brynjolfsson, E., Li, D. and Raymond, L. (2025) 'Generative AI at work', Quarterly Journal of Economics. https://doi.org/10.1093/qje/qjae044
Eloundou, T., Manning, S., Mishkin, P. and Rock, D. (2023) 'GPTs are GPTs: An early look at the labor market impact potential of large language models', arXiv preprint. https://arxiv.org/abs/2303.10130
Jarrahi, M.H., Newlands, G., Lee, M.K., Wolf, C.T., Kinder, E. and Sutherland, W. (2021) 'Algorithmic management in a work context', Big Data and Society, 8(2). https://doi.org/10.1177/20539517211020332
Spencer, D.A. (2017) 'Work in and beyond the Second Machine Age: The politics of production and digital technologies', Work, Employment and Society, 31(1), pp. 142–152. https://doi.org/10.1177/0950017016645716
Wang, Y. et al. (2025) 'AI-enabled conversational agents as digital employees in frontline service', Information and Management. https://doi.org/10.1016/j.im.2025.104099