The Step Between AI and Submission That Nobody Optimizes
Risk adjustment programs invest heavily in two areas: the AI that identifies and validates codes, and the coder workforce that processes charts. Between these two investments sits quality assurance, the step where coded output is reviewed for accuracy before submission to CMS. In most organizations, QA is the underfunded, understaffed, and underdesigned bottleneck that determines whether the investments in AI and coders actually produce defensible output.
A typical QA process works like this: after coders process a batch of charts, a QA team reviews a sample, usually 5% to 10% of completed work. They check whether codes match the documentation, whether the evidence supports the assigned Hierarchical Condition Category (HCC), and whether the coder followed protocol. Problems found in the sample trigger re-reviews. The rest of the batch, the 90% to 95% that wasn’t sampled, ships to CMS without QA review.
The OIG’s March 2026 audits found error rates between 81% and 91% across three plans. Those error rates reflect the quality of what was submitted, which means the quality of what passed through QA. Any defect that QA does not catch reaches CMS, so when QA reviews only a small sample, most defective codes pass through the filter untouched. The filter itself becomes the weakest link.
Why Sample-Based QA Misses Systemic Problems
Sample-based QA is designed to catch random errors: a coder who misread a note, a one-off mapping mistake, an isolated documentation oversight. It’s not designed to catch systemic problems: an AI that consistently overvalues a particular category, a documentation pattern that satisfies the coder’s threshold but not the auditor’s, or a gap in MEAT (Monitor, Evaluate, Assess, Treat) validation that applies to an entire class of diagnoses.
Systemic problems affect every chart they touch, not just the 5% that QA samples. If the AI misapplies MEAT criteria for vascular disease codes across 500 charts and QA reviews 25 of them, QA might catch the pattern, or it might not, depending on which 25 charts were sampled and whether the reviewer recognizes the systemic nature of the error. The other 475 charts ship with the same defect.
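The sampling arithmetic can be made concrete with a short sketch. It assumes a simple random sample drawn without replacement, so the number of affected charts in the sample follows a hypergeometric distribution. The batch size of 10,000 and the "at least three hits before a reviewer sees a pattern" threshold are illustrative assumptions, not figures from any audit.

```python
from math import comb


def expected_hits(batch_size: int, affected: int, sample_size: int) -> float:
    """Expected number of affected charts in a simple random sample."""
    return sample_size * affected / batch_size


def prob_at_least(batch_size: int, affected: int, sample_size: int, min_hits: int) -> float:
    """Probability that the sample holds at least `min_hits` affected charts.

    Hypergeometric model: sampling without replacement from one batch.
    """
    total = comb(batch_size, sample_size)
    # Count the samples that contain fewer than `min_hits` affected charts.
    fewer = sum(
        comb(affected, k) * comb(batch_size - affected, sample_size - k)
        for k in range(min_hits)
    )
    return 1.0 - fewer / total


# Hypothetical batch: 10,000 charts, 500 affected, 5% (500 charts) sampled.
print(expected_hits(10_000, 500, 500))  # 25.0

# A narrower defect (40 affected charts) is much easier to miss.
# min_hits=3 is an assumed threshold for a reviewer to notice a pattern.
print(prob_at_least(10_000, 40, 500, 3))
```

Even when the sample does contain affected charts, the calculation only describes what lands in front of a reviewer. Whether the reviewer recognizes those charts as one systemic defect is a separate question that sampling alone does not answer.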
The solution isn’t reviewing 100% of charts manually, which would eliminate the efficiency gains AI provides. The solution is AI-assisted QA that evaluates the full output for pattern-level problems before the sample-based human review begins.
AI-Augmented Quality Assurance
AI-augmented QA operates in two layers. The first layer runs automated checks across the full coded output. It evaluates every chart for MEAT completeness, evidence-code alignment, category-specific risk factors (e.g., single-occurrence acute conditions without current management documentation), and coding patterns that deviate from expected distributions. This automated layer catches systemic problems across the entire batch, not just the sampled subset.
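As a rough illustration of the first layer, the sketch below runs rule-based checks over every chart in a batch and then compares each category's share of the batch with an expected share. Everything in it is hypothetical: the `CodedChart` fields, the flag names, the "at least one MEAT element" threshold, the 50% tolerance, and the placeholder category labels are assumptions chosen for the example, not a description of any particular product.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CodedChart:
    """Hypothetical record for one coded chart."""
    chart_id: str
    hcc: str                                           # placeholder category label
    meat_elements: set = field(default_factory=set)    # e.g. {"monitor", "treat"}
    evidence_count: int = 0                            # evidence excerpts linked to the code
    single_acute_occurrence: bool = False
    current_management: bool = True


def screen_chart(chart: CodedChart, min_meat: int = 1) -> list:
    """Per-chart checks; returns a list of flag names (empty if the chart passes)."""
    flags = []
    if len(chart.meat_elements) < min_meat:
        flags.append("meat_incomplete")
    if chart.evidence_count == 0:
        flags.append("no_supporting_evidence")
    if chart.single_acute_occurrence and not chart.current_management:
        flags.append("acute_without_current_management")
    return flags


def distribution_outliers(charts, expected_share, tolerance=0.5):
    """Categories whose share of the batch deviates from the expected share
    by more than `tolerance` (relative to the expected share)."""
    if not charts:
        return []
    counts = Counter(c.hcc for c in charts)
    outliers = []
    for hcc, expected in expected_share.items():
        observed = counts[hcc] / len(charts)
        if abs(observed - expected) > tolerance * expected:
            outliers.append(hcc)
    return outliers


def screen_batch(charts, expected_share):
    """Run every check on every chart; returns {chart_id: [flags]}."""
    flags_by_chart = {c.chart_id: screen_chart(c) for c in charts}
    for hcc in distribution_outliers(charts, expected_share):
        for c in charts:
            if c.hcc == hcc:
                flags_by_chart[c.chart_id].append("category_distribution_outlier")
    return flags_by_chart
```

The per-chart rules catch individual defects, while the distribution check is the part that can surface a systemic pattern that no single chart reveals on its own.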
The second layer is human review, focused on the cases the automated layer flagged. Instead of reviewing a random 5% sample, the QA team reviews every chart the AI identified as potentially problematic, plus a random sample for calibration. Human judgment applies where the AI identified uncertainty. The rest of the output, which passed automated validation, proceeds with documented evidence trails that demonstrate the validation was performed.
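The routing step might look something like the following sketch. It takes a hypothetical mapping from chart ID to the list of automated flags raised for that chart, sends every flagged chart to human review, and draws a seeded random calibration sample from the charts that passed. The function name, the 5% default calibration rate, and the fixed seed are assumptions made for illustration.

```python
import math
import random


def build_review_queue(flags_by_chart, calibration_rate=0.05, seed=0):
    """Split a screened batch into flagged charts and a calibration sample.

    flags_by_chart: {chart_id: [flag, ...]}; an empty list means the chart passed.
    """
    flagged = sorted(cid for cid, flags in flags_by_chart.items() if flags)
    passed = sorted(cid for cid, flags in flags_by_chart.items() if not flags)

    # A seeded generator keeps the calibration draw reproducible,
    # so the selection itself can be documented in the QA trail.
    rng = random.Random(seed)
    k = min(len(passed), math.ceil(len(passed) * calibration_rate))
    calibration = sorted(rng.sample(passed, k))

    return {"flagged": flagged, "calibration": calibration}


# Hypothetical screened batch.
queue = build_review_queue(
    {"c1": [], "c2": ["meat_incomplete"], "c3": [], "c4": [], "c5": []},
    calibration_rate=0.5,
)
print(queue["flagged"])           # ['c2']
print(len(queue["calibration"]))  # 2
```

Drawing the calibration sample only from charts that passed automated validation gives a direct check on how often the automated layer lets a defect through.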
This approach produces two things traditional QA doesn’t: full-population systemic error detection and a documented QA trail for every chart, not just the sampled ones. When CMS audits a code, the plan can show not just the evidence trail for the coding decision but also the QA validation that confirmed the decision’s defensibility.
The Quality Investment That Compounds
QA is where AI quality and coder quality converge into submission quality. Plans that underinvest in QA undermine every dollar spent on better AI and better coders, because the output passes through a filter that doesn’t catch what it should. Retrospective risk adjustment coding programs that redesign QA around AI-augmented full-population screening, targeted human review, and documented validation trails are removing the bottleneck that sits between good technology and defensible submissions. The investment compounds because every improvement in QA rigor improves the defensibility of every code the program produces.
