5 min read
Feb 3, 2026
Chart Review: The Shift from Volume to Value
Some charts matter a lot.
Most don’t.
That imbalance has always been true. What’s changed is our ability to do something about it.
Health plans rely on chart review because they have to. Risk adjustment, payment integrity, quality reporting — all of it eventually comes back to the chart.
For years, chart review has forced a choice:
Review broadly and accept high cost
Or review narrowly and accept higher risk
Given those options, most organizations chose cost. That was a rational decision in a world where tooling was limited and visibility was poor.
What’s different now is that this trade-off is no longer the only option on the table.
AI has changed what is possible — and what is reasonable to accept — in chart-based workflows. Continuing to operate the same way is no longer just inefficient. It increasingly introduces unnecessary cost, risk, and operational drag.
The question is no longer whether AI belongs in these workflows.
The question is how to use it responsibly.
Responsible AI-Assisted Chart Review
The goal is simple:
Spend human time on the charts that actually influence outcomes, without giving up coverage, auditability, or control.
This is not about automating chart review or taking humans out of the process. It’s about being more deliberate about where human judgment is applied.
Example Use Case: Risk Adjustment
Risk adjustment makes the economics of AI-Assisted Chart Review easy to see, but the same structure applies to many chart-based workflows.
Step 1: Conservative Filtering
The first step is not to decide what’s “right” or “wrong.”
It’s to reduce volume.
AI is applied across the full chart population to answer a narrow question:
Is there any reasonable chance this chart could affect a member’s risk score?
Charts that are extremely unlikely to matter are set aside. Anything that might matter stays in.
This step alone often removes a meaningful share of charts from downstream review, without increasing risk, because most charts simply don’t contain actionable value.
This kind of conservative filtering wasn’t reliable before. Now it is.
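As a rough sketch of what this step can look like in code, the snippet below filters a population on a model-estimated relevance score. The Chart fields, the scores, and the threshold value are illustrative assumptions, not a prescribed implementation; in practice the cutoff would be chosen for very high recall on a labeled validation set.

```python
from dataclasses import dataclass


@dataclass
class Chart:
    chart_id: str
    relevance_score: float  # model-estimated chance the chart could affect a risk score


# Hypothetical cutoff: set low enough that a chart is only set aside
# when the model is confident it does not matter.
FILTER_THRESHOLD = 0.02


def conservative_filter(charts):
    """Split the full population into charts kept for downstream review and charts set aside."""
    keep, set_aside = [], []
    for chart in charts:
        if chart.relevance_score >= FILTER_THRESHOLD:
            keep.append(chart)
        else:
            set_aside.append(chart)
    return keep, set_aside


# Illustrative usage with made-up scores
population = [Chart("A-001", 0.91), Chart("A-002", 0.005), Chart("A-003", 0.14)]
kept, aside = conservative_filter(population)
print(f"{len(kept)} kept for review, {len(aside)} set aside")
```

The design choice that matters here is asymmetry: the filter is tuned so that the cost of keeping an irrelevant chart is accepted in exchange for almost never discarding a relevant one.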
Step 2: Look More Closely Where It Counts
Once the population of charts is smaller, a more precise model can be applied to what remains.
In risk adjustment, this typically surfaces:
Suggested net new or upgraded HCCs
Gaps or inconsistencies in documentation
Clinical evidence that could affect submission decisions
At this stage, the system is making recommendations, not final calls.
Each recommendation is paired with the underlying evidence and a clear explanation. If a reviewer can’t understand how a conclusion was reached, their evaluation will be time-consuming and often unreliable.
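One way to picture what a recommendation carries with it is a simple record that keeps the suggestion, the rationale, and the supporting evidence together. The sketch below is illustrative only; the field names are assumptions, not a specific product schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    document_id: str
    excerpt: str               # the chart passage the suggestion rests on
    page: Optional[int] = None


@dataclass
class Recommendation:
    chart_id: str
    suggested_hcc: str         # e.g. a net new or upgraded HCC
    rationale: str             # plain-language explanation a reviewer can evaluate
    evidence: List[Evidence] = field(default_factory=list)
    model_confidence: float = 0.0
    status: str = "pending_review"  # only a human reviewer moves this to accepted or rejected
```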
Step 3: Keep People Responsible for Decisions
Every recommendation is reviewed by a human.
This preserves accountability, ensures decisions can be defended, and maintains a clear audit trail from evidence to submission. The AI provides suggestions; people decide.
Just as importantly, disagreement is signal. When a reviewer overrides a recommendation, that decision doesn't just disappear into a spreadsheet—it’s captured. Over time, these overrides make it clear where the system is reliable, where human judgment adds the most value, and where additional safeguards are needed.
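A minimal sketch of how overrides might be captured and summarized follows; the fields, reason categories, and example entries are assumptions for illustration, not a required format.

```python
from collections import Counter

# Each entry pairs the model's suggestion with the reviewer's decision and,
# when rejected, the reason the reviewer gave.
override_log = [
    {"chart_id": "A-001", "suggestion": "HCC 18", "decision": "accepted", "reason": None},
    {"chart_id": "A-003", "suggestion": "HCC 85", "decision": "rejected", "reason": "insufficient documentation"},
    {"chart_id": "A-007", "suggestion": "HCC 85", "decision": "rejected", "reason": "condition resolved"},
]


def override_summary(log):
    """Summarize how often reviewers disagree, and why, so patterns become visible over time."""
    rejected = [entry for entry in log if entry["decision"] == "rejected"]
    rate = len(rejected) / max(len(log), 1)
    reasons = Counter(entry["reason"] for entry in rejected)
    return rate, reasons


rate, reasons = override_summary(override_log)
print(f"override rate: {rate:.0%}", dict(reasons))
```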
Then there is the remainder: charts that the conservative filter kept but the Step 2 precision model didn’t flag. These are the “gray area.”
Initially, those charts are reviewed as well.
This is intentional. Early on, teams need to answer hard questions:
Are meaningful charts being missed?
Where does uncertainty cluster?
Which types of misses actually create operational or audit risk?
Only once those questions are answered does it make sense to reduce review of the “gray area” charts, moving from full review to targeted review to sampling. Review volume decreases because uncertainty decreases, not because controls are removed.
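One way to make that progression concrete is a simple policy that widens or narrows gray-area review based on the observed miss rate. The thresholds and the risk_flag field in this sketch are assumptions chosen for illustration, not recommendations.

```python
import random


def gray_area_mode(observed_miss_rate):
    """
    Illustrative thresholds only: stay in full review until the data shows
    misses are rare, then narrow to targeted review, then to sampling.
    """
    if observed_miss_rate > 0.05:
        return "full_review"
    if observed_miss_rate > 0.01:
        return "targeted_review"
    return "sampling"


def select_for_review(gray_area_charts, mode, sample_fraction=0.2):
    """Pick which gray-area charts still get human review under the current mode."""
    if not gray_area_charts:
        return []
    if mode == "full_review":
        return list(gray_area_charts)
    if mode == "targeted_review":
        # Review only charts matching known risk patterns (hypothetical flag)
        return [chart for chart in gray_area_charts if chart.get("risk_flag")]
    sample_size = max(1, int(len(gray_area_charts) * sample_fraction))
    return random.sample(gray_area_charts, k=sample_size)
```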
What Changes in Practice
In retrospective risk adjustment, teams using this approach typically see:
A small fraction of charts requiring human review
Reviewer time spent on charts that affect outcomes
More consistent decisions
Clearer audit trails than fully manual workflows
Just as importantly, programs become easier to scale. When the full population is reviewed systematically and humans focus on decisions that matter, growth costs become more predictable.
A Common Pattern in Chart Review
The same logic applies in:
DRG payment integrity, where only a subset of claims affect reimbursement
HEDIS abstraction, where value is concentrated in charts that close or fail measures
Other workflows where review cost is flat but value is uneven
The question is the same in every case: where does human effort actually change the outcome?
Deciding where human judgment is needed, and where it is not, is where most of the economic and operational leverage lives.
Closing Thoughts
The goal is not to automate chart review; it is to systematically align human effort with impact.
Many AI initiatives fail because they try to skip the hard work of building trust. When conservative filtering is skipped, steps are collapsed, or oversight is removed too early, the efficiency gains are quickly offset by missed findings and weaker governance: you save time, but you lose value.
Done correctly, the model is simple: AI reviews the full population, while humans own the decisions that matter. Over time, confidence replaces uncertainty, review volumes deliberately decrease, and the program scales without eroding quality.
The organizations that succeed in this next era will not be the ones that "add AI" to a legacy process. They will be the ones that architect integrated workflows that combine modern AI capabilities with expert judgment.
If this framing resonates, the most important question is not whether to use AI or humans, but how to divide the work between them. I’m always interested in comparing notes on how different organizations approach that problem.