AI mastery in customer care: Raising the bar for quality assurance

Quality assurance (QA) in the contact center is still largely a manual process, prone to human bias and limited by its reliance on small, random samples of customer conversations to evaluate and improve agent performance. Now, much of this could be automated through generative AI (gen AI), resulting in cost savings, improved agent performance, and greater customer satisfaction.

From our early work in this space, we estimate that a largely automated QA process could achieve more than 90 percent accuracy—compared to 70 to 80 percent accuracy through manual scoring—and savings of more than 50 percent in QA costs.

Here we explore how gen-AI-enabled contact center analytics could transform QA, as seen at one financial services company. This blog forms part of our series on AI in customer service operations and builds on our article, Gen AI in customer care: Early successes and challenges.

The benefits of shifting from manual to automated QA

Currently, QA is predominantly a manual process that consumes substantial labor per agent. Manual assessment typically covers only a small fraction of total conversations—less than 5 percent—and human bias can compromise the accuracy of overall quality evaluations.

Gen AI can now partially automate the quality analysis of live interactions, including calls, emails, and chats, and assess various aspects of agent performance, such as call flow, verification, procedural adherence, authentication, problem understanding, resolution, and soft skills.

A financial services company that recently deployed gen AI in its QA process achieved more than 90 percent accuracy across key quality parameters. It also identified initiatives that could improve customer experience by five percentage points and save 25 to 30 percent on contact center costs through enhanced agent performance and QA efficiency.

Improved agent training is another potential positive of deploying gen AI for QA. Integrating additional datasets such as standard operating procedures (SOPs), scripts, and agent behavior metrics into the QA process could return even more comprehensive results—creating tailor-made coaching programs for each agent, for example.

Overall, we estimate that gen AI has the potential to yield more than 50 percent savings in QA costs, a 25 to 30 percent increase in agent efficiency, and a 5 to 10 percent improvement in customer satisfaction.

Mitigating risk through thoughtful design

Naturally, as with any application of gen AI, deployment plans need to consider certain risk factors, including gen AI’s limited ability to verify “unseen data” used by agents, and “model hallucination,” where gen AI might generate misleading outputs unrelated to the client’s context. Bias is another well-known gen AI risk that can be mitigated by leveraging technology such as model embedding.

Ensuring agents adhere to the correct process can also be a challenge with gen AI. However, technology developments, as well as thoughtful design and controls, can mitigate these risks.

Other important considerations when rolling out this technology include stringent data anonymization protocols to protect personal information, especially in sectors such as healthcare and finance, and how to manage complex and varied call intents and QA parameters across different clients and industries. These concerns can be addressed with customized prompt libraries for each question, or more complex models for questions with multiple scoring criteria.
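The customized prompt libraries and anonymization protocols described above can be sketched together in a few lines. This is a minimal illustration, not a production design: the question IDs, prompt wording, and redaction patterns below are all illustrative assumptions.

```python
# Hypothetical sketch: a per-question prompt library with simple PII redaction.
# Question IDs, templates, and redaction patterns are illustrative assumptions.
import re

PROMPT_LIBRARY = {
    "greeting": (
        "Score 0-1: Did the agent greet the customer and state their name?\n"
        "Transcript:\n{transcript}\n"
        "Answer with a number and a one-line rationale."
    ),
    "verification": (
        "Score 0-1: Did the agent verify the customer's identity before "
        "discussing account details?\n"
        "Transcript:\n{transcript}\n"
        "Answer with a number and a one-line rationale."
    ),
}

def redact(transcript: str) -> str:
    """Mask common PII patterns before the transcript leaves the QA system."""
    transcript = re.sub(r"\b\d{16}\b", "[CARD]", transcript)          # card numbers
    transcript = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", transcript)  # SSN-style IDs
    return transcript

def build_prompt(question_id: str, transcript: str) -> str:
    """Fill the question-specific template with an anonymized transcript."""
    return PROMPT_LIBRARY[question_id].format(transcript=redact(transcript))
```

Keeping one template per question (rather than one generic prompt) lets each question carry its own scoring criteria and context, which is what makes multi-client, multi-intent QA manageable.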

Harnessing the benefits of gen AI in QA: A six-step approach

Companies that are ready to explore the benefits of gen AI in QA automation can follow six core steps to set up a pilot and build a business case:

  • Evaluate the current QA process. Map the existing manual workflow and review the questions and scoring criteria.
  • Assess data quality. Test the quality of the input data, which could include call transcripts, chats, and emails.
  • Develop questions. Distinguish between questions that can be scored through contact or interaction data only and those that require the overlay of additional datasets outside of what gen AI “sees.”
  • Create prompts. Develop and refine large language model (LLM) prompts, with a detailed context for scoring.
  • Validate results. Compare gen AI scores for a sample of interactions against a human-scored validation dataset.
  • Continue to refine. Refine prompts based on validation results, and synthesize outputs in an easy-to-use format for supervisors and agents.
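The prompting, scoring, and validation steps above can be sketched as follows. This is a hedged illustration only: `call_llm` is a stand-in for a real model call, and the questions, transcripts, and human scores are illustrative assumptions.

```python
# Minimal pilot sketch: prompt-based scoring plus validation against human QA.
# `call_llm` is a stub standing in for a real LLM call; all data is illustrative.

def call_llm(prompt: str) -> int:
    """Stub: a real deployment would call an LLM and parse its 0/1 answer."""
    return 1 if "thank you for calling" in prompt.lower() else 0

def score_interaction(transcript: str, questions: dict[str, str]) -> dict[str, int]:
    """Score one interaction on every QA question in the prompt library."""
    return {
        qid: call_llm(template.format(transcript=transcript))
        for qid, template in questions.items()
    }

def validation_accuracy(ai_scores: list[dict], human_scores: list[dict]) -> float:
    """Share of question-level scores where gen AI agrees with human QA."""
    pairs = [
        (ai[q], human[q])
        for ai, human in zip(ai_scores, human_scores)
        for q in ai
    ]
    return sum(a == h for a, h in pairs) / len(pairs)

# Illustrative run on one interaction and one question.
questions = {"greeting": "Did the agent greet the caller? Transcript: {transcript}"}
ai = [score_interaction("Thank you for calling Acme, how can I help?", questions)]
human = [{"greeting": 1}]
print(validation_accuracy(ai, human))  # 1.0
```

In practice, the validation step would run over hundreds of human-scored interactions per question, and questions falling below the accuracy threshold would loop back into prompt refinement.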

Gen AI is already transforming other processes in the contact center, and its value for QA is becoming ever more apparent, too.

As organizations look to use gen AI to drive efficiency and improve customer experience, a mindful application of contact analytics, with appropriate safeguards and expert inputs, could significantly enhance the overall quality assurance process and deliver advantages for early adopters.
