The ROI of AI claims investigation: 3 carrier case studies

$308B: Annual US insurance fraud loss (CAIF / Colorado State 2022)
~10%: P&C claims involving fraud (NICB / III)
14+ days: Manual SIU baseline per case (Hesper internal benchmarks)
2-4 hours: AI investigation per case (Hesper internal benchmarks)

Carriers asking for ROI on AI claims investigation usually get a vendor pitch deck. Here is the math without one. We model three carrier scenarios rather than name customers, because the published industry benchmarks are strong enough to do honest math, and because public case studies of AI investigation deployments at named US P&C carriers are still rare. Where named outcomes appear in this post (AXA Switzerland with Shift Technology, Tokio Marine with Tractable), they are clearly attributed to those vendors and used as evidence the category produces measurable returns.

The thesis: AI claims investigation pays back inside year one for any P&C carrier with more than 25,000 annual claims, and the payback math is dominated by coverage uplift, not labor displacement. The Coalition Against Insurance Fraud puts the annual US fraud cost at $308.6 billion, and SIU staffing has not kept pace with claim volume, which is why coverage gaps now drive most of the loss.

We size three tiers, attach industry baselines to each, and then layer Hesper internal benchmarks on top to produce payback periods and three-year cumulative savings. For the buyer-side framework that pairs with this analysis, see our complete guide to autonomous AI claims investigation.

The ROI question every claims VP is being asked

Finance teams are not asking whether AI claims investigation works. They are asking why now, at what scale, and on what payback horizon. The pressure is structural. Fraud losses keep climbing while SIU headcount does not. The Insurance Information Institute, citing the 2022 CAIF SIU Benchmarking Study, reports SIU staff growth of 1.4% year over year, well below the trend of prior years and far below the growth in flagged-claim volume.

The result is a coverage gap. Manual SIU teams investigate roughly 25% of flagged claims, per Hesper internal benchmarks, because per-investigator throughput sits near 10 closed cases per month against caseloads of 200+. The other 75% close out without a defensible investigation record. That is the variable AI investigation moves, and it is the variable finance teams should be modeling.

Regulatory pressure is the second forcing function. The NAIC Insurance Fraud Prevention Model Act #680 has been adopted, in whole or in part, across most US states, requiring carriers to maintain antifraud plans and dedicated SIU functions. State regulations go further. California's 10 CCR 2698.36 requires the SIU to investigate each credible referral, including system-generated ones. Under that standard, a 25% coverage rate is hard to defend at exam time. ROI models that ignore exam exposure understate the value of AI investigation by a meaningful margin.

Methodology: how we built these models

Each scenario uses three inputs: annual claims volume, fraud incidence (10% of claims, per NICB and III for P&C), and current SIU throughput. We then apply Hesper internal benchmarks for AI investigation cost and coverage to compute the delta. We do not include deterrence effects, brand-trust uplift, or any soft dollars in the headline number. Those belong in a sensitivity case, not the base case.

The fully-loaded SIU FTE cost we use is approximately $170,000 per year. That figure derives from a $96,000 base salary baseline (referenced in CAIF SIU benchmarking work and cited via the III page above) plus roughly 75% loading for benefits, tools, and overhead. Treat it as a derivation, not a direct citation. For a deeper look at how to compare per-investigator productivity assumptions across vendors, see our 2026 SIU performance benchmarks.
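The derivation above is simple enough to check directly. The following is a minimal Python sketch of that arithmetic; the function and constant names are illustrative, and the 70% and 80% loading rates in the loop are assumptions added only to show how sensitive the figure is to the loading estimate.

```python
# Fully-loaded SIU FTE cost, derived from the figures quoted in the text.
BASE_SALARY = 96_000   # base salary baseline cited in the text
LOADING_RATE = 0.75    # "roughly 75%" for benefits, tools, and overhead


def loaded_fte_cost(base_salary: float, loading_rate: float) -> float:
    """Return base salary plus proportional loading."""
    return base_salary * (1 + loading_rate)


derived_cost = loaded_fte_cost(BASE_SALARY, LOADING_RATE)
print(f"Derived fully-loaded cost: ${derived_cost:,.0f}")

# Sensitivity to the loading assumption (70% and 80% are illustrative only).
for rate in (0.70, 0.75, 0.80):
    print(f"  loading {rate:.0%}: ${loaded_fte_cost(BASE_SALARY, rate):,.0f}")
```

The derived value comes out slightly under the ~$170,000 used in the model, which is why the figure is best read as a rounded derivation rather than a cited number.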

We anchor each scenario against publicly disclosed vendor outcomes where they exist. AXA Switzerland's published work with Shift Technology covered more than 1 million claims and stopped over €12 million in fraud. Shift has separately published a 4x year-one ROI benchmark for fraud detection deployments. Those are detection-side numbers. The investigation-side ROI we model below compounds on top of detection ROI rather than substituting for it.

Input	Value used	Source
Fraud incidence (P&C)	10% of claims	NICB / III
Manual SIU coverage	25% of flagged claims	Hesper internal benchmarks
AI investigation coverage	100% of flagged claims	Hesper internal benchmarks
Manual cost per case	~$2,500	Hesper internal benchmarks
AI cost per case	~$150	Hesper internal benchmarks
Manual throughput	~10 cases / investigator / month	Hesper internal benchmarks
AI throughput	800+ cases / investigator / month	Hesper internal benchmarks
Fully-loaded SIU FTE	~$170,000 / year	Derived from $96K base + 75% loading
Confirmed-fraud rate on investigated	35-50%	CAIF SIU Benchmarking via III
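
The inputs in this table are enough to reproduce the coverage and investigation-cost rows used in the scenarios that follow. Here is a minimal sketch in Python, assuming manual capacity is simply investigators × monthly throughput × 12 and that AI investigation covers every flag at the per-case cost. The `Carrier` class and `coverage_model` function are illustrative names, not part of any Hesper tooling.

```python
from dataclasses import dataclass

# Inputs taken from the table above.
FRAUD_INCIDENCE = 0.10             # share of P&C claims flagged
CASES_PER_INVESTIGATOR_MONTH = 10  # manual throughput
MANUAL_COST_PER_CASE = 2_500       # dollars
AI_COST_PER_CASE = 150             # dollars


@dataclass
class Carrier:
    name: str
    annual_claims: int
    siu_ftes: int


def coverage_model(carrier: Carrier) -> dict:
    """Compute flagged volume, manual coverage, and investigation cost."""
    flagged = round(carrier.annual_claims * FRAUD_INCIDENCE)
    capacity = carrier.siu_ftes * CASES_PER_INVESTIGATOR_MONTH * 12
    investigated = min(capacity, flagged)  # cannot investigate more than is flagged
    return {
        "flagged": flagged,
        "manual_investigated": investigated,
        "manual_coverage": investigated / flagged,
        "uninvestigated": flagged - investigated,
        "manual_cost": investigated * MANUAL_COST_PER_CASE,
        "ai_cost": flagged * AI_COST_PER_CASE,  # 100% coverage at AI per-case cost
    }


# The three carrier profiles used in the scenarios in this post.
CARRIERS = [
    Carrier("Small regional", 25_000, 8),
    Carrier("Mid-market", 150_000, 35),
    Carrier("Top-25 national", 1_000_000, 200),
]

results = {c.name: coverage_model(c) for c in CARRIERS}
for name, r in results.items():
    print(
        f"{name}: {r['manual_investigated']:,} of {r['flagged']:,} flags "
        f"investigated manually ({r['manual_coverage']:.0%}); "
        f"manual ${r['manual_cost']:,} vs AI ${r['ai_cost']:,}"
    )
```

The sketch stops at coverage and investigation cost. The year-one savings ranges in the scenario tables also depend on recovery per case, which is carrier-specific and is not modeled here.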

Coverage uplift, not labor displacement

The single most common modeling mistake is treating AI investigation as a labor-substitution play. The dominant ROI driver is the additional 75% of flagged claims that move from un-investigated to investigated. The labor savings are real but secondary. Models that lead with FTE reduction will systematically understate the recovery side.

Scenario 1: small regional carrier

Profile: 25,000 annual claims, 8 SIU FTEs, regional footprint with one or two state filings. Fraud-flagged volume sits near 2,500 cases per year. At ~10 closed cases per investigator per month, the SIU has annual capacity for roughly 960 investigations (8 investigators × 10 cases × 12 months), so manual coverage lands near 38% of flagged volume. That is better than the ~25% benchmark used elsewhere in this post, but it still leaves 1,500+ flagged claims closing without investigation each year.

AI investigation lifts coverage to 100% at a per-case cost near $150. The same 8 FTEs shift into review and decision-making over AI-produced evidence packages, which is why throughput per investigator can move from ~10 to 800+ per month. The carrier does not reduce headcount in this model. It reallocates it.

Line item	Manual baseline	With AI investigation
Annual claims	25,000	25,000
Flagged for SIU (10%)	2,500	2,500
Investigated	~960 (38%)	2,500 (100%)
Cost per investigation	~$2,500	~$150
Annual investigation cost	~$2.4M	~$375K
Additional investigated cases	0	1,540
Year-1 net savings (labor + coverage recovery, conservative)	-	$3-6M
Estimated payback period	-	6-9 months

At this tier, the case for AI investigation is less about FTE displacement and more about coverage. Small SIUs cannot investigate everything that flags, and the missed flags are exactly where organized rings target. Regulatory exposure under NAIC Model Act 680 applies regardless of size, so the exam-posture argument is as strong here as at a national carrier.

Scenario 2: mid-market multi-state carrier

Profile: 150,000 annual claims, 35 SIU FTEs, multi-state footprint. Flagged volume is 15,000 per year. Manual capacity near 4,200 closed cases per year covers about 28% of flags. The other 10,800 flagged claims close without investigation. This is the tier where the absolute dollar uplift from AI investigation is largest, because the un-investigated volume is large in absolute terms and average exposure per case is meaningful.

It is also the tier where AXA Switzerland's published Shift Technology results sit, in carrier-size terms. Their team analyzed more than 1 million claims and stopped €12M+ in fraud on the detection side alone. Investigation-side returns layer on top of that detection envelope, because the marginal cost of investigating a flag drops from ~$2,500 to ~$150 once AI is doing the evidence assembly.

Line item	Manual baseline	With AI investigation
Annual claims	150,000	150,000
Flagged for SIU (10%)	15,000	15,000
Investigated	~4,200 (28%)	15,000 (100%)
Cost per investigation	~$2,500	~$150
Annual investigation cost	~$10.5M	~$2.25M
Additional investigated cases	0	10,800
Year-1 net savings range	-	$25-60M
Estimated payback period	-	4-7 months

Cost per investigation: manual vs AI (Scenario 2 inputs)

Manual SIU per case: ~$2,500

AI investigation per case: ~$150

Mid-market carriers also tend to be where statute-of-limitations decay erodes recovery. Cases that sit in queue for months past SIU intake lose subrogation and recovery options. Compressing investigation from 14+ days to 2-4 hours, per Hesper internal benchmarks, recovers exposure that would otherwise expire. That is not in the table above, and it should be in your sensitivity case.

“Working with Shift Technology, AXA Switzerland analyzed more than 1 million claims and stopped over €12 million in fraud, demonstrating measurable financial returns from AI applied at the detection layer.”
- Shift Technology / AXA Switzerland published case study

Scenario 3: Top-25 national carrier

Profile: 1 million-plus annual claims, 200+ SIU FTEs, national footprint, public scrutiny. Per the NAIC 2024 P&C Market Share Report, the top-25 P&C carriers account for roughly two-thirds of US direct premiums written. At this tier, every 1% of investigation coverage is worth tens of millions in exposure.

Flagged volume sits at 100,000 cases per year. Manual capacity, even at 200 FTEs, covers around 24,000 cases or about 24% of flags. The other 76,000 close without investigation. Hard-dollar ROI math at this scale is straightforward, but the regulatory ROI layer is where the model really pulls away. Under California 10 CCR 2698.36, an SIU must investigate each credible referral. A carrier that misses 76,000 flags per year cannot credibly attest to that standard during a market conduct exam.

Line item	Manual baseline	With AI investigation
Annual claims	1,000,000	1,000,000
Flagged for SIU (10%)	100,000	100,000
Investigated	~24,000 (24%)	100,000 (100%)
Cost per investigation	~$2,500	~$150
Annual investigation cost	~$60M	~$15M
Additional investigated cases	0	76,000
Year-1 net savings range	-	$150-400M
Estimated payback period	-	3-5 months

Tier-1 international carriers have already moved on AI where the math has cleared. Tokio Marine deployed Tractable for auto-damage assessment in April 2020, compressing decisions from 2-3 weeks to minutes. The investigation-side compression we model here (14+ days to 2-4 hours) sits in the same category. The buyer profile is also similar: a large carrier with a defined queue problem, public scrutiny, and a finance team that can read a payback table.

What ROI modeling misses

The hard-dollar model above understates value in three places, and these are usually the largest line items by year three.

Regulatory examination posture

Under California 10 CCR 2698.36 and the NAIC Model Act 680 antifraud plan standard, a 25% investigation coverage rate is increasingly indefensible. Remediation costs from a market conduct exam (consent orders, mandated process changes, reporting overhead, in some cases fines) are rarely modeled in vendor business cases. They should be. A 100% investigation coverage record is not just an operational metric. It is the answer to the question the examiner will ask first.

Reserve accuracy

Reserves set on partial information move when investigation completes. Compressing investigation from 14+ days to 2-4 hours means reserves get tightened earlier in the claim lifecycle, which feeds directly into combined ratio. We do not include reserve adjustment savings in the headline numbers above because they vary by line of business, but for long-tail lines like workers' comp and commercial liability, the reserve effect can rival the labor savings.

Investigator retention

SIU teams lose people to queue burnout. The work is high-judgment, but investigators spend most of the day on document gathering and report drafting. Reframing the role from execution to decision-making reduces turnover and recruiting costs, and improves the quality of the investigations that matter most. Recruiting an experienced certified fraud examiner runs into six figures of fully-loaded cost. A model that ignores retention misses that line item entirely.

There is also a cost-side mirror to this analysis. The integration architecture you choose determines how much of the headline ROI survives contact with the carrier IT estate. We cover that in detail in our review of hidden integration costs in legacy claims AI, and the diligence questions that surface them sit in our vendor evaluation checklist for SIU leaders.

The forward-looking implication is operational, not financial. Once 100% investigation coverage becomes available at $150 per case, it moves from competitive advantage to baseline expectation. Carriers that hold 25% coverage in 2027 will be answering exam questions their peers will not, and writing checks on leakage their peers will have closed.

Key takeaways

AI claims investigation ROI is dominated by coverage uplift, not labor displacement, because moving from 25% to 100% investigation coverage is the largest single variable in the model.
Small regional carriers (25,000 claims/year) typically see payback in 6-9 months because the coverage gap is the primary driver and the platform cost is small relative to the recovery delta.
Mid-market carriers (150,000 claims/year) see the largest absolute dollar uplift; AXA Switzerland's published €12M+ result with Shift Technology sits in this tier on the detection side.
Top-25 national carriers face regulatory ROI on top of financial ROI, because NAIC Model Act 680 and state rules like California 10 CCR 2698.36 make a 25% investigation coverage rate hard to defend.
The ROI elements most often missed (examination posture, reserve accuracy, investigator retention) are typically larger by year three than the headline labor savings and should sit in the sensitivity case.

Frequently asked questions

Payback typically lands inside 6-9 months for any P&C carrier with more than 25,000 annual claims and a documented manual SIU baseline. The math is dominated by two variables: cost-per-case delta (manual SIU runs ~$2,500 fully loaded vs ~$150 for AI investigation, per Hesper internal benchmarks) and coverage uplift (manual SIU coverage is around 25% of flagged claims, AI hits 100%). The recovery delta from investigating the previously un-touched 75% almost always exceeds platform cost in year one. Carriers under 10,000 claims per year should expect 12-18 month payback because fixed software costs become a larger fraction of total spend. Shift Technology has published a 4x year-one ROI benchmark for fraud detection alone, and investigation-side ROI compounds on top of that.

Sizing the coverage-uplift opportunity takes three inputs. First, baseline fraud-flagged volume, which equals annual claims times incidence rate (10% per NICB and III for P&C). Second, current investigated share: in most carriers, manual SIU touches roughly 25% of flags because of throughput limits (caseloads of 200+ per investigator, ~10 closed per month). Third, average dollar value per investigated case, typically the line-of-business average paid claim. AI investigation lifts the investigated share to 100%. Multiply the additional 75% by average exposure and a confirmed-fraud rate (per CAIF benchmarks, 35-50% of investigated cases produce a denial, deferral, or recovery), then discount by your collection efficiency. Most mid-market carriers find $30M-$80M in previously un-touched annual exposure once they run the math.
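
The sizing arithmetic described above can be sketched in a few lines of Python. The flagged volume, the 25% manual coverage, and the 35-50% confirmed-fraud range come from this post. The average exposure per case and the collection efficiency are hypothetical placeholders, chosen only to make the example run. Substitute your own line-of-business figures.

```python
def recoverable_exposure(
    flagged_claims: int,
    manual_coverage: float,
    avg_exposure: float,
    confirmed_fraud_rate: float,
    collection_efficiency: float,
) -> float:
    """Estimate annual recovery from flags that manual SIU never reaches."""
    additional_cases = flagged_claims * (1.0 - manual_coverage)
    gross_exposure = additional_cases * avg_exposure
    return gross_exposure * confirmed_fraud_rate * collection_efficiency


FLAGGED = 15_000               # mid-market carrier: 150,000 claims x 10%
MANUAL_COVERAGE = 0.25         # benchmark coverage quoted in this post
AVG_EXPOSURE = 5_000           # HYPOTHETICAL average paid claim, in dollars
COLLECTION_EFFICIENCY = 0.60   # HYPOTHETICAL share of confirmed fraud recovered

# Bracket the estimate with the low and high ends of the 35-50% range.
low = recoverable_exposure(
    FLAGGED, MANUAL_COVERAGE, AVG_EXPOSURE, 0.35, COLLECTION_EFFICIENCY
)
high = recoverable_exposure(
    FLAGGED, MANUAL_COVERAGE, AVG_EXPOSURE, 0.50, COLLECTION_EFFICIENCY
)
print(f"Estimated annual recovery: ${low:,.0f} to ${high:,.0f}")
```

Running the function at both ends of the confirmed-fraud range gives a bracket rather than a point estimate, which is usually the more honest way to present the number to a finance team.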

Smaller carriers benefit too, but the value math shifts. Under 100,000 claims per year, the case for AI investigation is less about labor displacement and more about coverage. Small SIUs cannot investigate everything that flags, and the missed claims are where organized rings target. A regional carrier with 25,000 annual claims, ~2,500 fraud flags, and 8 SIU FTEs can investigate about 1,000 cases manually per year. AI investigation closes that 1,500-case gap. ROI per dollar invested is often higher at this tier than at Top-25 carriers because the labor cost basis is leaner and every additional investigated case is incremental. Regulatory pressure under NAIC Model Act 680 applies regardless of carrier size, so the exam-posture argument carries equal weight.

Detection platforms and investigation sit at different layers of the stack. FRISS, Shift Technology, and Verisk are detection platforms. They score and flag claims for SIU review. Their published ROI (for example AXA Switzerland's €12M+ in fraud stopped with Shift across 1 million-plus claims) is real but measures detection accuracy. Hesper sits downstream and addresses the bottleneck those platforms expose: SIUs cannot investigate everything detection flags. The two ROI streams compound rather than substitute. A carrier already running a detection platform and seeing 4x year-one ROI on detection can layer investigation and capture additional returns on the previously un-investigated flag volume, because the marginal cost of investigating a flag drops from ~$2,500 to ~$150 per case.

Track four metrics monthly. First, investigation coverage rate (flagged claims actually investigated divided by total flagged); manual baseline is around 25%, AI target is 100%. Second, average time-to-disposition; manual baseline is 14+ days, AI target is 2-4 hours per Hesper internal benchmarks. Third, cost per investigation, fully loaded with investigator salary, tools, and overhead. Fourth, confirmed-fraud dollar recovery per investigated case. The ROI denominator is platform spend; the numerator is the marginal recovery from the now-investigated 75%, plus statute-of-limitations recoveries you were not capturing, plus reserve adjustment savings. Keep deterrence effects out of the headline number; they belong in a separate sensitivity case where assumptions are explicit.
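
The four metrics and the ROI ratio described above can be computed from a small monthly snapshot. Here is a minimal sketch in Python, assuming hypothetical field names and entirely hypothetical sample values. Only the definitions (coverage rate, time-to-disposition, cost per investigation, recovery per case, and the numerator and denominator of the ROI ratio) come from the text.

```python
from dataclasses import dataclass


@dataclass
class MonthlySnapshot:
    flagged: int                      # claims flagged this month
    investigated: int                 # flagged claims actually investigated
    total_days_to_disposition: float  # summed across investigated cases
    investigation_spend: float        # fully loaded: salary, tools, overhead
    confirmed_recovery: float         # confirmed-fraud dollars recovered


def monthly_metrics(s: MonthlySnapshot) -> dict:
    """Return the four tracking metrics for one month."""
    if s.flagged <= 0 or s.investigated <= 0:
        raise ValueError("flagged and investigated must be positive")
    return {
        "coverage_rate": s.investigated / s.flagged,
        "avg_days_to_disposition": s.total_days_to_disposition / s.investigated,
        "cost_per_investigation": s.investigation_spend / s.investigated,
        "recovery_per_case": s.confirmed_recovery / s.investigated,
    }


def investigation_roi(marginal_recovery: float, sol_recovery: float,
                      reserve_savings: float, platform_spend: float) -> float:
    """ROI ratio: recoveries and savings over platform spend.

    Deterrence effects are deliberately excluded from the numerator.
    """
    if platform_spend <= 0:
        raise ValueError("platform_spend must be positive")
    return (marginal_recovery + sol_recovery + reserve_savings) / platform_spend


# HYPOTHETICAL sample month; every value below is illustrative only.
snapshot = MonthlySnapshot(
    flagged=1_250,
    investigated=1_250,
    total_days_to_disposition=156.25,  # about 3 hours per case
    investigation_spend=187_500.0,
    confirmed_recovery=2_000_000.0,
)
metrics = monthly_metrics(snapshot)
roi = investigation_roi(1_500_000.0, 100_000.0, 50_000.0, 187_500.0)

for key, value in metrics.items():
    print(f"{key}: {value:,.3f}")
print(f"ROI ratio: {roi:.1f}x")
```

Keeping the ROI inputs as separate arguments makes it easy to report the ratio with and without the statute-of-limitations and reserve components, so the headline number stays auditable.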

Three elements are most often missed. First, regulatory examination posture. Under California 10 CCR 2698.36 and the NAIC Model Act 680 antifraud plan standard, a 25% investigation coverage rate is increasingly hard to defend at exam time, and remediation costs are rarely in vendor ROI models. Second, reserve accuracy. Earlier investigation tightens claim reserves, which feeds combined ratio, particularly on long-tail lines like workers' comp. Third, investigator retention. SIUs lose people to queue burnout; reframing the role from execution to decision-making reduces turnover and recruiting costs. None of these show up in the typical spreadsheet, but they are often the largest line items by year three. Build a sensitivity case that prices each one before you sign.

Production deployment for a single line of business typically runs 30-60 days, followed by another 30-60 days of parallel operation against the existing SIU workflow to build confidence. Full ROI visibility appears in months 4-6, once the AI is investigating live volume and recoveries are working through collections. Carriers running on modern claims platforms (Guidewire ClaimCenter, Duck Creek, Snapsheet) deploy faster than carriers on bespoke mainframes. Integration cost is the swing variable, and we cover the categories of hidden integration cost (services markup, customization, training, integration latency, opportunity cost, and lock-in) in detail in our companion piece on legacy claims AI integration. Run the integration diligence in parallel with the ROI model, not after.