2026 Edition

The Support QA Playbook

Score Every Conversation. Keep Every Customer.

15 min read · For Support Leaders, QA Managers, Team Leads · Updated January 2026

Support QA is hard to get right. If you've tried it before and it didn't stick, you're not alone.

Most QA programs fail. The rubric gets built for auditors, not for the people who actually need to use it: agents, team leads, and the customers who never see it but feel the results.

You've probably seen this pattern: Someone builds a scorecard with 15 categories and 47 sub-questions. Reviewers spend 30 minutes per ticket. Scores go into a spreadsheet. Nothing changes. Within three months, QA becomes a checkbox exercise everyone resents, and the spreadsheet stops getting updated.

Or maybe your QA program does run, but it feels disconnected from reality. Agents with high scores somehow have worse customer satisfaction. The rubric rewards following scripts instead of solving problems. You're measuring something, but you're not sure it's the right thing.

This playbook takes a different approach. It's built around one question: Did we solve the problem and keep the customer?

Everything else only matters if it serves that goal: the tone, the greeting, the script adherence—all of it. Look, I know you've seen QA programs die before. Give this a read anyway. By the end, you'll have a rubric you can actually use, guidance on the stuff that makes it stick, and a clear path to getting started this week.

Why Most QA Programs Fail

Before we get to the rubric, it's worth understanding why QA fails. You already know it's hard. But knowing the specific failure modes helps you avoid them.

Measuring what's easy instead of what matters

Process metrics are seductive because they're objective. "Did the agent use the customer's name?" has a clear answer. "Did the customer leave feeling valued?" requires judgment. So rubrics drift toward what's easy to measure: greetings, sign-offs, script compliance, hold time.

Here's the thing: customers want their problem solved. The greeting is background noise. An agent who says "Thanks for calling, how can I provide you with excellent service today?" but fails to fix the issue is worse than an agent who says "Hey, what's up?" and resolves it in two minutes.

What works: Weight outcomes heavily. This rubric uses 60% outcomes, 40% process. Some organisations go as high as 85/15. (I've watched teams argue about the exact ratio for hours. Start with 60/40 and adjust based on what you learn.) The right ratio depends on your context, but if your rubric weights process and outcomes equally, you're over-weighting process.
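If it helps to see the arithmetic, here's a minimal sketch of a weighted conversation score. The 1–5 scale, the category contents, and the exact function names are assumptions for illustration, not part of the rubric itself.

```python
# Minimal sketch of a weighted QA score, assuming each question is scored 1-5.
# Category contents and weights are illustrative; adjust to your own rubric.

OUTCOME_WEIGHT = 0.6   # e.g. resolution, retention risk, customer effort
PROCESS_WEIGHT = 0.4   # e.g. tone, accuracy, documentation

def category_score(scores: list[int]) -> float:
    """Average a category's question scores and normalise to 0-1."""
    return (sum(scores) / len(scores)) / 5.0

def conversation_score(outcome_scores: list[int], process_scores: list[int]) -> float:
    """Blend outcome and process categories into a single 0-100 score."""
    blended = (OUTCOME_WEIGHT * category_score(outcome_scores)
               + PROCESS_WEIGHT * category_score(process_scores))
    return round(blended * 100, 1)

# Example: strong outcomes, average process -> the outcome weighting dominates.
print(conversation_score(outcome_scores=[5, 4, 5], process_scores=[3, 3, 4]))  # 82.7
```

The useful property of the 60/40 split shows up in the example: an agent who solves the problem but sounds a little flat still scores well, while a polished conversation that goes nowhere can't.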

Calibration isn't optional

Here's what happens without it: your 4 is someone else's 3. I've seen teams where scores varied by a full point depending on who reviewed. At that point you're measuring reviewer mood, not agent performance.

The fix is simple, but people skip it anyway: monthly calibration sessions where reviewers score the same conversations independently, then compare. Target 80% agreement within one point. (Honestly, 70% is fine when you're starting, as long as you're converging.) The disagreements are the whole point. That's where you figure out what the rubric actually means.
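For the agreement check itself, a small script is often easier than a spreadsheet. This is a sketch only: it assumes every reviewer scored the same conversations on the same 1–5 scale, and the reviewer names and sample scores are made up.

```python
# Sketch of a calibration check: what share of reviewer-pair comparisons
# land within one point of each other? (Assumes a shared 1-5 scale.)
from itertools import combinations

def agreement_within_one(scores_by_reviewer: dict[str, list[int]]) -> float:
    """Fraction of pairwise comparisons where two reviewers are within one point."""
    within, total = 0, 0
    for a, b in combinations(scores_by_reviewer.values(), 2):
        for score_a, score_b in zip(a, b):
            total += 1
            if abs(score_a - score_b) <= 1:
                within += 1
    return within / total

# Three reviewers scoring the same five conversations (illustrative data).
session = {
    "reviewer_a": [4, 3, 5, 2, 4],
    "reviewer_b": [4, 4, 4, 3, 4],
    "reviewer_c": [3, 2, 5, 4, 4],
}
print(f"{agreement_within_one(session):.0%} agreement")  # 87% in this sample
```

Run it after each session and track the number over time. The trend matters more than any single month's figure, and the specific conversations where pairs disagree by two or more points are the ones worth discussing.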

Scores without action are just data entry

If you're not coaching off the scores, stop scoring. You're wasting everyone's time.


Enough about failure. Here's the rubric.

Get the full rubric

One email. Full rubric. No nurture sequence.


What if you could QA 100% of conversations?

Hay scores every conversation automatically and surfaces the ones that need human attention.

Start your pilot