The CIO checklist for an AI claims investigation rollout: security, integrations, MSAs

5 gates - Security, integrations, rollout, contracts, governance. What a CIO confirms before signature.
5 criteria - SOC 2 Trust Services Criteria (AICPA): Security, Availability, Processing Integrity, Confidentiality, Privacy.
20+ states - Adopting the NAIC AI Model Bulletin. Carrier is accountable for vendor AI.
$308B - Annual US insurance fraud loss (Coalition Against Insurance Fraud).

The CIO inherits the decision last and can still veto it

By the time an AI claims investigation platform reaches the CIO, the SIU director has championed it and finance has signed off on the unit economics. The CIO's job is to confirm that the platform is safe to integrate, contractable, and governable inside the carrier's stack. That review is fast - days, not weeks - but it is not a rubber stamp. A single unresolved data-flow diagram, a missing SOC 2 report, or a contract clause that reserves the right to train on carrier claim data is a hard stop, regardless of how strong the loss-ratio case looks upstream.

The reason the decision matters is the loss exposure behind it. Roughly 10% of property-casualty claims involve fraud, and the Coalition Against Insurance Fraud estimates insurance fraud steals at least $308 billion every year from American consumers. The platform under review exists to close the coverage gap on that exposure: it takes manual SIU investigation from 14+ days per case to 2-4 hours and lifts coverage of flagged claims from roughly 25% toward 100%. That is the outcome the CIO is being asked to enable. The CIO does not own that number - the Claims VP does - but the CIO owns whether the platform can land in the stack without creating new risk.

This is a different integration than the detection tools a CIO has already vetted. Hesper AI sits downstream of detection - it does not flag claims, it investigates the ones already flagged - so the shape is flag-in, audit-ready-report-back, not a new scoring engine wired into FNOL. The positioning is "from fraud detection to fraud resolution," and the resolution layer is the one this checklist is built to evaluate. For the full definition of that layer and how it differs from detection, see our pillar on autonomous AI claims investigation.

What follows is the checklist itself, organized as five gates a CIO can run in an afternoon: data security, integrations and identity, rollout phasing, contracts, and AI-specific governance. Each gate ends in a yes-or-no the CIO can hand straight to Compliance and Legal. The one-page version is at the end.

Section 1 - Data security and compliance

Security is the first gate because it is the one that fails deals silently. A claims investigation platform handles claimant PII and sometimes PHI, plus the full evidentiary record of a fraud case. The CIO is verifying that the data is encrypted, isolated, kept in a defined jurisdiction, and never repurposed. Three sub-checks cover it.

SOC 2 Type II and ISO 27001 are the floor

SOC 2 Type II is table stakes, not a differentiator. SOC 2 examines a vendor's controls against the five Trust Services Criteria defined by the AICPA: Security, Availability, Processing Integrity, Confidentiality, and Privacy. The distinction the CIO has to insist on is Type II over Type I. Type I confirms the controls were designed at a single point in time; Type II confirms they were tested over a period, typically 6 to 12 months. Ask for the most recent Type II report and a bridge letter covering the gap from the report date to today - not a logo on a trust page. ISO 27001 alongside SOC 2 confirms a managed information-security program rather than a one-off audit.
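The Type II and bridge-letter check above reduces to date arithmetic, which makes it easy to record as a repeatable test. The following is a minimal sketch, not a standard tool: the `Soc2Evidence` fields, the 180-day minimum window, and the 30-day tolerance for an uncovered gap are all assumptions a carrier would set for itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Soc2Evidence:
    report_type: str                              # "Type I" or "Type II"
    period_start: date                            # start of the audited window
    period_end: date                              # end of the audited window
    bridge_letter_through: Optional[date] = None  # date the bridge letter covers up to


def soc2_gate(evidence: Soc2Evidence, today: date, max_gap_days: int = 30) -> list[str]:
    """Return a list of findings; an empty list means the gate passes."""
    findings = []
    if evidence.report_type != "Type II":
        findings.append("Report is not Type II")
    window_days = (evidence.period_end - evidence.period_start).days
    if window_days < 180:  # assumption: roughly six months as the minimum window
        findings.append(f"Audit window is only {window_days} days")
    # Without a bridge letter, coverage stops at the end of the audit period.
    covered_through = evidence.bridge_letter_through or evidence.period_end
    gap_days = (today - covered_through).days
    if gap_days > max_gap_days:
        findings.append(f"{gap_days} days uncovered since {covered_through.isoformat()}")
    return findings
```

A Type II report with a recent bridge letter returns no findings; a Type I report with no bridge letter fails on all three counts.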

SOC 2 is the floor, not the ceiling. We break down the specific data-handling controls a carrier should verify in our post on SOC 2 and data handling for AI fraud investigation, which underpins the security section of this checklist. Treat that post as the deep reference and this section as the gate.

Encryption, PII handling, and data residency

The CIO needs the data-flow diagram, not a sentence. Confirm encryption in transit (TLS 1.2 or higher) and at rest (AES-256 or equivalent), and confirm how claimant PII and PHI are handled inside the platform - field-level treatment, masking where the investigation does not need raw identifiers, and retention windows. Confirm US data residency, with the actual cloud region named, and confirm the tenancy model: a single-tenant or logically isolated multi-tenant architecture, stated explicitly, not implied. Ask for the named subprocessor list - every third party that touches the data, including the foundation-model provider - because that list is what Legal attaches to the DPA. A vendor that cannot produce a one-page data-flow diagram on request is a vendor whose data flow you do not yet understand.
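The in-transit half of the encryption check can be probed from the carrier's side. The sketch below uses Python's standard `ssl` module to build a client that refuses anything older than TLS 1.2 and reports what a vendor endpoint actually negotiates; the host passed to `negotiated_tls_version` would be whatever API hostname the vendor provides (none is named here). Encryption at rest cannot be verified this way and still has to come from the SOC 2 report and the data-flow diagram.

```python
import socket
import ssl
from typing import Optional


def strict_tls_context() -> ssl.SSLContext:
    """Client context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_tls_version(host: str, port: int = 443, timeout: float = 5.0) -> Optional[str]:
    """Connect to the endpoint and return the negotiated protocol, e.g. 'TLSv1.3'.

    Raises ssl.SSLError if the server cannot meet the TLS 1.2 floor.
    """
    ctx = strict_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```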

The training-on-customer-data clause

The single most common CIO veto is the training-on-customer-data clause, and it is worth checking before anything else in the contract. Many AI vendors reserve the right to use customer data to improve or train shared and foundation models. For carrier claim data - PII, PHI, evidentiary material - that is a non-starter. The contract must state, in plain language, that the vendor does not train shared or foundation models on carrier claim data, that tenant isolation is contractual rather than aspirational, and that any model improvement happens on the carrier's tenant or not at all. This is the clause a generic SaaS security checklist tends to omit, and it is the one that protects the carrier's data from becoming a competitor carrier's signal.

Type I and Type II are not interchangeable

A Type I SOC 2 report says the controls were designed correctly on one day. A Type II report says they operated correctly across a 6-to-12-month window. For a platform sitting inside your live claims workflow, only Type II tells you the controls hold under load. Always ask which type, and always ask for the report plus a current bridge letter.

Section 2 - Integrations and identity

The second gate is where the platform actually meets the stack. The CIO is confirming the integration shape, the identity model, and how the platform coexists with detection vendors already in place. The headline for an investigation layer is that the surface is small: it consumes a flagged claim and returns a report, so it does not re-architect the claims system of record.

Claims and policy admin: ClaimCenter, Duck Creek, FNOL feeds

The integration shape is flag-in, audit-ready-report-back. A flagged claim flows from Guidewire ClaimCenter - or from a detection vendor like FRISS or Shift - into the investigation platform; the platform runs its investigation; and an audit-ready report flows back into ClaimCenter as a case attachment. ClaimCenter stays the system of record. Guidewire supports this through its Marketplace of validated partner solutions plus ClaimCenter APIs and SDKs, so the integration is a configured connection, not a custom rebuild. Duck Creek Claims integrates the same way as a cloud-native target. Under the hood, Hesper runs 15+ investigation phases in parallel on each flagged claim and returns a single report the human SIU lead reviews - so what lands back in the case file is one reviewable artifact, not 15 separate outputs to reconcile.

Because the surface is small, the integration costs are smaller than a core-system project - but they are not zero. We map the categories a CIO should price before signing in our post on the hidden integration costs of adding AI to legacy claims suites. A CIO asking "why not just the ClaimCenter or Duck Creek AI add-on?" should weigh two things: those suite modules carry their own integration, customization, and lock-in costs, and none of them ship the autonomous investigation layer. That layer has no other named software vendor; the incumbent is the manual SIU team.

SSO/SAML, SCIM, and least-privilege access

Identity is non-negotiable. The platform must support SAML or OIDC single sign-on so investigators authenticate through the carrier's existing IdP, and SCIM provisioning so accounts are created and deprovisioned automatically when HR changes the roster. Investigator access should be role-scoped to least privilege - an investigator sees the cases in their queue, not the full claim population - and every access event should be logged. Confirm session controls, MFA enforcement at the IdP, and that there is no shared service account fronting investigator activity. If the platform cannot ride the carrier's SSO and SCIM, it becomes a standing identity-management liability, not just an inconvenience.
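To make the identity requirements concrete, here is a minimal sketch of the two behaviors worth testing in a pilot. The first function builds the SCIM 2.0 PATCH request (per RFC 7644) that an IdP sends to deactivate an account when someone leaves the roster; the second is a least-privilege check. The `/Users/{id}` path is relative to a vendor SCIM base URL that is not specified here, and the `assigned_investigators` case field is hypothetical.

```python
import json

SCIM_PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def scim_deactivate_request(user_id: str) -> dict:
    """Build the SCIM 2.0 PATCH an IdP would send to deactivate one account."""
    return {
        "method": "PATCH",
        "path": f"/Users/{user_id}",  # relative to the vendor's SCIM base URL
        "headers": {"Content-Type": "application/scim+json"},
        "body": json.dumps({
            "schemas": [SCIM_PATCH_SCHEMA],
            "Operations": [{"op": "replace", "path": "active", "value": False}],
        }),
    }


def can_view_case(investigator_id: str, case: dict) -> bool:
    """Least privilege: an investigator sees only cases assigned to their queue."""
    # 'assigned_investigators' is a hypothetical field name for illustration.
    return investigator_id in case.get("assigned_investigators", [])
```

In a pilot, the useful test is the negative one: deactivate a test user in the IdP and confirm the platform session ends and the account no longer opens any case.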

APIs, webhooks, and detection-vendor coexistence

REST APIs and webhooks are what let the investigation layer sit alongside detection vendors without disturbing them. The flag can arrive from the detection vendor or from claims-system triage; the report can be pushed back via webhook the moment it is ready. This is the coexistence point a CIO who has already integrated FRISS or Shift cares about: the investigation platform is complementary to FRISS, Shift Technology, and Verisk - not a replacement. It consumes their output rather than competing for the detection slot. Note one distinction worth keeping straight: Shift's recent agentic AI is handler-assist for adjusters, which sits upstream of an autonomous investigation layer, so the two are adjacent, not the same. Adding investigation does not touch existing detection contracts or integrations.
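On the receiving side, a report-ready webhook is only as trustworthy as its authentication. The sketch below shows one common pattern, an HMAC-SHA256 signature over the raw request body, using Python's standard `hmac` module. Whether a given vendor signs webhooks this way is an assumption to confirm during review, and the payload fields `claim_id` and `report_url` are hypothetical.

```python
import hashlib
import hmac
import json


def verify_webhook(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw body in constant time."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


def handle_report_ready(secret: bytes, raw_body: bytes, signature_hex: str) -> dict:
    """Reject unsigned or tampered events, then extract what the claims system needs."""
    if not verify_webhook(secret, raw_body, signature_hex):
        raise PermissionError("webhook signature mismatch")
    event = json.loads(raw_body)
    # Hypothetical payload fields; the claims system attaches the report to this claim.
    return {"claim_id": event["claim_id"], "attach_url": event["report_url"]}
```

Verifying against the raw bytes, before any JSON parsing, matters: re-serializing the payload can change whitespace or key order and invalidate a legitimate signature.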

Integration concern	What to confirm	Why it matters to the CIO
Claims system of record	Flag-in / report-back via Marketplace + APIs (ClaimCenter, Duck Creek)	No rip-and-replace; ClaimCenter stays system of record
Identity	SAML/OIDC SSO + SCIM provisioning + role-scoped access	Investigators ride the carrier IdP; clean deprovisioning
Connectivity	REST APIs + webhooks	Flag arrives from detection or triage; report pushed on completion
Detection coexistence	Sits downstream of FRISS / Shift / Verisk	Adds investigation layer without disturbing detection contracts
Output	Single audit-ready report as a case attachment	One reviewable artifact, not 15 raw outputs

Section 3 - Rollout phasing and change management

The third gate is how the platform goes live without disrupting the claims floor. The CIO and SIU director should agree on a bounded pilot before any production cutover, and the rollout plan should account for the role change on the investigator side, not just the technical cutover.

Pilot to production

Scope a bounded pilot: one line of business and a defined cohort of flagged claims, with success criteria the CIO and SIU director set in advance. The pilot validates integration, data flow, security posture, and output quality against those criteria before any production cutover. Because the platform consumes flagged claims and returns reports rather than replacing the claims system, the integration surface is smaller than a core-system migration, so the pilot can be measured in the quality of the output rather than the survival of the cutover. Size the pilot against real capacity - an AI investigation layer can take an investigator from roughly 10 manual investigations a month toward 800+ - and use the measured lift to scope production. For the budget, timeline, and change-management view from the business side, see the Claims VP deployment playbook.
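The capacity figures above turn into a simple sizing calculation. In the sketch below, the per-investigator rates (roughly 10 manual investigations a month, 800+ with the platform) come from this article; the pilot cohort of 400 flagged claims a month and the team of 10 investigators are hypothetical inputs chosen only to show the arithmetic.

```python
import math


def coverage(flagged_per_month: int, investigators: int, cases_per_investigator: int) -> float:
    """Share of flagged claims the team can investigate, capped at 100%."""
    capacity = investigators * cases_per_investigator
    return min(1.0, capacity / flagged_per_month)


def investigators_needed(flagged_per_month: int, cases_per_investigator: int) -> int:
    """Headcount required to investigate every flagged claim."""
    return math.ceil(flagged_per_month / cases_per_investigator)


# Hypothetical pilot: 400 flagged claims a month, 10 investigators.
manual = coverage(400, 10, 10)     # 0.25, in line with the roughly 25% baseline
assisted = coverage(400, 10, 800)  # 1.0, capacity exceeds the cohort
```

At the manual rate, full coverage of that cohort would take 40 investigators; at the assisted rate, one. The pilot's job is to replace the article's figures with measured ones before production is scoped.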

Change management for adjusters and investigators

The role change is the part a technical cutover plan tends to miss. The investigator's role shifts from execution to decision-making: instead of spending 14+ days assembling a case by hand, the investigator reviews an audit-ready draft, overrides where judgment differs, and signs the decision. That is less work in aggregate, but it is different work, and the rollout plan must not double cognitive load during transition by asking investigators to run both the manual workflow and the new one in parallel for the whole pilot. Train on the review-and-override workflow, keep the human SIU lead clearly in the approval seat, and stage the cutover so investigators trust the output before they depend on it.

"The investigation platform is a small integration surface and a large workflow change. The CIO who scopes only the integration and not the role shift is solving the easy half of the rollout." - Hesper AI product research

Section 4 - Contracts: MSA, DPA, SLA

The fourth gate is where the CIO hands the file to Legal, but the CIO sets the requirements. Three documents carry the weight: the MSA, the DPA, and the SLA. Each has a small number of terms that, for claims data specifically, a generic SaaS contract gets wrong.

MSA and DPA essentials

The DPA should name every subprocessor, commit to US data residency, set a breach-notification window measured in hours not days, and require deletion of carrier data on termination with a deletion certificate. The MSA should state explicitly that the carrier owns the investigation outputs - both the report and the underlying evidence the agent assembled - and that the vendor does not train shared or foundation models on carrier claim data. IP ownership of outputs is the term most often left ambiguous, and ambiguity there is what lets a vendor argue the evidence record is theirs. For a fraud case that may end in an examination under oath (EUO), a suspicious activity report (SAR) filing, or litigation, the carrier needs the output to be unambiguously its own.

SLA, uptime, and liability

The SLA should carry an uptime commitment with defined RTO and RPO, support tiers with named response times, and a liability framework sized to the sensitivity of the claims data being processed. Because the platform sits in the investigation workflow rather than the FNOL path, a brief outage does not stop new claims from being filed - but it does delay investigations, so the uptime target should reflect the carrier's tolerance for that delay. Confirm the liability cap is not a token figure relative to the volume and sensitivity of the PII and PHI flowing through the platform; a cap set at one month of fees is not proportionate to a claims-data breach.
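An uptime percentage is easier to negotiate once it is converted into the delay it permits. The helper below is a minimal sketch of that conversion; the 99.9% and 99.5% targets are hypothetical examples, since no specific figure is prescribed here.

```python
def allowed_downtime_minutes(uptime_pct: float, days: int = 30) -> float:
    """Minutes of downtime an uptime commitment permits over a period."""
    total_minutes = days * 24 * 60
    return round(total_minutes * (100.0 - uptime_pct) / 100.0, 1)


# Hypothetical targets over a 30-day month:
three_nines = allowed_downtime_minutes(99.9)  # about 43 minutes
two_and_half = allowed_downtime_minutes(99.5)  # 216 minutes
```

Compare the result with how long the SIU can tolerate investigations being paused, then check that the RTO in the SLA is no longer than that tolerance.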

Exit and data portability

Define the exit path before signing, not at renewal. The contract should guarantee export of all investigation reports and underlying evidence in a portable, documented format, plus a deletion certificate for the carrier's data on the vendor's side. A documented exit path is what keeps the carrier from being locked in: if the platform is replaced, the investigation record - which has regulatory and evidentiary value - travels with the carrier. This is also a governance requirement, because the antifraud-plan record the carrier files cannot live somewhere the carrier cannot retrieve it.

Section 5 - AI-specific governance

The fifth gate is the one a horizontal AI-vendor checklist misses entirely, and it is the one Compliance and Legal care about most. For an AI system making or supporting decisions on insurance claims, the governance question is whether the output is auditable, explainable, and accountable under insurance-specific regulation. Two sub-checks cover it.

Auditability, explainability, human-in-the-loop

Every agent decision must be logged with its sources, its reasoning, and timestamps, and a human SIU lead must review and be able to override before any action. This is what makes the output defensible: a reconstructable decision trail that a state DOI examiner can read end to end. That audit-trail-native design is what lets the output satisfy California's 10 CCR 2698.36 documented-decision requirement and appear in an antifraud plan filed under NAIC Model Act #680, which is adopted in 48 states. For the CIO, this is a procurement requirement, not a nice-to-have - a black-box conclusion with no supporting evidence is exactly what Legal and Compliance veto. Ask the vendor to show a sample report: the evidence chain, the cited sources, the reasoning behind the recommendation, and confirmation that an investigator can edit and sign off before it leaves the system.
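One way to read the requirement above is as a data contract: every logged step carries sources, reasoning, and a timestamp, and nothing is released without a human sign-off. The sketch below illustrates that contract; it is not Hesper's actual schema, and every class and field name is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AgentStep:
    phase: str
    finding: str
    reasoning: str
    sources: tuple[str, ...]
    logged_at: datetime


@dataclass
class InvestigationReport:
    claim_id: str
    steps: list[AgentStep] = field(default_factory=list)
    reviewed_by: Optional[str] = None
    reviewer_override: Optional[str] = None

    def log(self, phase: str, finding: str, reasoning: str, sources: list[str]) -> None:
        """Record one agent step; a step without a cited source is rejected."""
        if not sources:
            raise ValueError("every logged step must cite at least one source")
        stamp = datetime.now(timezone.utc)
        self.steps.append(AgentStep(phase, finding, reasoning, tuple(sources), stamp))

    def sign_off(self, reviewer: str, override: Optional[str] = None) -> None:
        """Human review: the SIU lead approves, optionally recording an override."""
        self.reviewed_by = reviewer
        self.reviewer_override = override

    def releasable(self) -> bool:
        """Nothing leaves the system without logged steps and a human sign-off."""
        return bool(self.steps) and self.reviewed_by is not None
```

When reviewing a vendor's sample report, the same questions apply: can each conclusion be traced to a cited source and a timestamp, and is the reviewer's identity and any override part of the record?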

NAIC AI Model Bulletin and bias/fairness

The accountability stays with the insurer, even when the AI is built and operated by a vendor. The NAIC adopted its Model Bulletin on the Use of Artificial Intelligence Systems by Insurers on December 4, 2023, and more than 20 states have adopted or are pursuing it. It requires insurers to maintain a written AIS Program commensurate with risk, to ensure AI-supported decisions comply with unfair-trade-practice and other laws, to oversee third-party vendors through contracting and testing-validation protocols, and to keep documentation regulators can request in an examination. For the CIO, that means the platform must feed the carrier's AIS Program: logged decisions, vendor testing and validation evidence, and retrievable documentation. A vendor that cannot produce that record creates a compliance gap the carrier - not the vendor - answers for.

The carrier owns the AI accountability

Under the NAIC Model Bulletin, the insurer is accountable for AI-supported decisions even when a vendor builds and runs the model. The CIO's governance check is therefore not 'does the vendor have AI governance' - it is 'does this platform produce the documentation our written AIS Program needs to be defensible in a DOI examination.' Logged, explainable, human-reviewed output is what answers that.

The one-page CIO checklist

Consolidated, the five gates run as a yes-or-no checklist a CIO can complete in an afternoon and hand to Compliance and Legal. Anything that resolves to no on data flow, SOC 2, the training clause, or the audit trail is a hard stop until it resolves to yes.

Gate	Checklist item	Pass condition
Security	SOC 2 Type II + ISO 27001	Current Type II report + bridge letter on file
Security	Encryption + US data residency + named subprocessors	Data-flow diagram and subprocessor list provided
Security	No training on carrier claim data	Explicit contractual ban + tenant isolation
Integrations	Flag-in / report-back to ClaimCenter or Duck Creek	Marketplace/API integration; suite stays system of record
Integrations	SAML/OIDC SSO + SCIM + least-privilege	Rides carrier IdP; access logged
Integrations	Coexists with FRISS / Shift / Verisk	API/webhook; detection contracts untouched
Rollout	Bounded pilot with agreed success criteria	One line of business; criteria set with SIU director
Rollout	Change-management plan for the role shift	No doubled cognitive load during transition
Contracts	MSA: carrier owns outputs + evidence	IP ownership unambiguous in writing
Contracts	DPA: subprocessors, residency, breach window, deletion	Deletion certificate on termination
Contracts	SLA: uptime, RTO/RPO, proportionate liability	Liability cap sized to claims-data sensitivity
Contracts	Documented exit and data portability	Portable export of reports + evidence
Governance	Logged, explainable, human-in-the-loop decisions	Reconstructable trail; investigator override + sign-off
Governance	Feeds the carrier's written AIS Program	Retrievable documentation for DOI examination
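The hard-stop rule can be captured in a few lines if the checklist is tracked as data. This is a minimal sketch under stated assumptions: the item keys are hypothetical labels for rows in the table above, and the intermediate "open items" status is an illustration rather than part of the checklist itself.

```python
# Hypothetical keys for the four items named as hard stops.
HARD_STOPS = {"data_flow_diagram", "soc2_type2", "no_training_clause", "audit_trail"}


def evaluate(checklist: dict[str, bool]) -> str:
    """Summarize a yes/no checklist; an unanswered hard-stop item counts as a no."""
    failed = {item for item, passed in checklist.items() if not passed}
    missing = HARD_STOPS - checklist.keys()
    if (failed | missing) & HARD_STOPS:
        return "hard stop"
    if failed:
        return "open items"
    return "ready for Compliance and Legal"
```

Treating a missing answer as a failure is deliberate: an unanswered question on data flow or the training clause is exactly the silent gap the security gate exists to catch.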

This checklist is the technical-buyer companion to the SIU-leader version. Where this one front-loads data flow, contracts, and governance, the 12-point evaluation checklist for SIU leaders front-loads case quality, override workflow, and audit-trail depth. Run both and the carrier has covered the technical and the operational veto points before signature.

Key takeaways

The CIO reviews an AI claims investigation platform last but can veto on a single unresolved data-flow or SOC 2 gap, so the checklist front-loads security.
SOC 2 Type II across the five Trust Services Criteria is the floor, and the decisive contract term is an explicit ban on training shared models on carrier claim data.
The integration shape is flag-in, audit-ready-report-back through ClaimCenter or Duck Creek APIs - an added investigation layer, not a core-system replacement, and complementary to FRISS, Shift, and Verisk.
Under the NAIC AI Model Bulletin, adopted December 4, 2023 and pursued by more than 20 states, the carrier is accountable for vendor AI, so the platform must feed the written AIS Program with retrievable documentation.
The MSA, DPA, and SLA must lock down subprocessors, IP ownership of outputs, data residency, proportionate liability, and a documented exit path before signature.

Frequently asked questions

What should a CIO check before approving an AI claims investigation platform?

Run five gates. First, security: SOC 2 Type II and ISO 27001, encryption in transit and at rest, US data residency, and a contractual ban on training shared models on your claim data. Second, integrations: confirm the platform reads flagged claims from Guidewire ClaimCenter or Duck Creek and writes its report back as a case attachment, with SAML or OIDC SSO, SCIM provisioning, and REST APIs or webhooks. Third, rollout: a bounded pilot on one line of business with agreed success criteria. Fourth, contracts: an MSA and DPA covering subprocessors, IP ownership of outputs, SLA, liability, and a documented exit path. Fifth, AI governance: logged, explainable decisions with human-in-the-loop review that feed your written AIS Program under the NAIC AI Model Bulletin. Anything unresolved on data flow or SOC 2 is a hard stop.

Which SOC 2 criteria matter for an AI claims investigation platform?

SOC 2 examines controls against five Trust Services Criteria defined by the AICPA: Security, Availability, Processing Integrity, Confidentiality, and Privacy. For a claims investigation platform handling claimant PII and sometimes PHI, Security, Confidentiality, and Privacy are non-negotiable, and Availability matters because the platform sits in your claims workflow. Ask for SOC 2 Type II, not Type I: Type II means the controls were tested over a period, typically 6 to 12 months, not just designed at a point in time. Request the most recent report and a bridge letter covering the gap to today. SOC 2 is the floor, not the ceiling - pair it with ISO 27001, a named subprocessor list, and a no-training-on-customer-data clause in the contract.

How does an AI claims investigation platform integrate with Guidewire ClaimCenter or Duck Creek?

Through the Guidewire Marketplace and ClaimCenter's APIs and SDKs rather than a rip-and-replace. The integration shape is flag-in, report-back: a flagged claim flows from ClaimCenter, or from a detection vendor like FRISS or Shift, into the investigation platform; the platform runs its investigation; and an audit-ready report flows back into ClaimCenter as a case attachment. ClaimCenter remains the system of record. Hesper runs 15+ investigation phases in parallel on each flagged claim and returns a single report a human SIU lead reviews. Duck Creek Claims integrates the same way as a cloud-native target. Because the platform sits downstream of detection, it coexists with FRISS, Shift, and Verisk rather than replacing them, so the CIO is adding an investigation layer, not re-architecting the claims stack.

Who is accountable when the AI is built or operated by a vendor?

The accountability stays with the insurer, even when the AI is built or operated by a vendor. The NAIC adopted the Model Bulletin on the Use of Artificial Intelligence Systems by Insurers on December 4, 2023, and more than 20 states have adopted or are pursuing it. It requires insurers to maintain a written AIS Program commensurate with risk, to ensure AI-supported decisions comply with unfair-trade-practice and other laws, to oversee third-party vendors through contracting and testing-validation protocols, and to keep documentation regulators can request in an examination. For a CIO, that means the platform must feed the carrier's AIS Program: logged decisions, vendor testing evidence, and retrievable documentation. A vendor that cannot produce that record creates a compliance gap the carrier, not the vendor, answers for.

What should the MSA, DPA, and SLA cover for an AI claims investigation vendor?

The DPA should name every subprocessor, commit to US data residency, set a breach-notification window, and require deletion of carrier data on termination. The MSA should state explicitly that the carrier owns the investigation outputs - the report and the underlying evidence - and that the vendor will not train shared or foundation models on carrier claim data. Add an SLA with an uptime commitment and defined RTO and RPO, support tiers, and a liability framework sized to the sensitivity of the claims data. Critically, define the exit path: export of reports and evidence in a portable format plus a deletion certificate, so the carrier is never locked in. These terms are where a generic SaaS contract falls short of what insurance claims data requires.

How should a carrier plan the rollout of an AI claims investigation platform?

Plan in two phases: a bounded pilot, then production. The pilot - one line of business and a defined flagged-claim cohort - lets the CIO and SIU director validate integration, data flow, security posture, and output quality against agreed success criteria before any production cutover. Because the platform consumes flagged claims and returns reports rather than replacing the claims system, the integration surface is smaller than a core-system project. The bigger variable is change management: the investigator's role shifts from execution to decision-making, so the plan should avoid doubling investigator cognitive load during transition. Size the pilot against real capacity - an AI investigation layer can take an investigator from roughly 10 manual investigations a month toward 800+ - and use the measured lift to scope production. Carriers on pre-cloud mainframe claims stacks should expect longer timelines.

Can an AI investigation platform run alongside FRISS, Shift, or Verisk?

Yes, and that is the most common deployment. FRISS, Shift, and Verisk operate at the detection layer: they flag suspicious claims. An autonomous investigation platform sits downstream and resolves the flag. The integration is API- or webhook-based, so the flag can arrive from the detection vendor or from claims-system triage. Hesper is complementary to FRISS, Shift Technology, and Verisk - not a replacement. For a CIO, this means the AI investigation layer does not disturb existing detection contracts or integrations; it consumes their output. The investigation layer is the one no detection vendor occupies - the incumbent there is the manual SIU team running 14+ days per case at roughly 25% coverage of flagged claims. Adding investigation moves coverage toward 100% without re-architecting detection.

What makes an AI investigation output defensible in a regulatory examination?

A reconstructable decision trail. Every step the agent takes should be logged with its sources, its reasoning, and timestamps, and a human SIU lead should review and be able to override before any action. That record is what lets the output satisfy California's 10 CCR 2698.36 documented-decision requirement and appear in an antifraud plan filed under NAIC Model Act #680, adopted in 48 states. For the CIO, defensibility is a procurement requirement, not a nice-to-have: a black-box conclusion with no supporting evidence is what Legal and Compliance veto. Ask the vendor to show a sample report - the evidence chain, the cited sources, the reasoning behind the recommendation - and confirm an investigator can edit and sign off on it before it leaves the system.