Practitioners — Accountants (CA/PA) · updated 2026-06-11 · methodology v2.3

AI Hallucination on Consumer Duty for Accountants (CA/PA) in the United Kingdom

Accountants (CA/PA): AI summaries of Consumer Duty may misstate professional obligations

Chartered and public accountants engaged on Consumer Duty fair-value reporting are increasingly using AI to validate fair-value assessment methodology, draft committee-ready summaries of non-monetary benefit analysis, and prepare audit-evidence memos that reconcile the firm's pricing rationale against the FCA's stated expectations. The work feeds directly into audit-file memos, fair-value attestation packs, and board-paper assertions that an external auditor will revisit.

Two frontier AI models tested by the RLB Specialist Panel produced two substantive failures on this regulation under audit conditions. The failure classes recorded are Inference Drift on Fair Value Quantification Expectation and Inference Drift on Required Depth of Non-Monetary Analysis. The Panel prepared the questions from real practical AI usage in this audience's workflows, and each finding is bound to verbatim regulator-issued source text held as primary substrate.

The Consumer Duty (PS22/9 introducing Principle 12 and PRIN 2A, in force for open products from 31 July 2023 and for closed products from 31 July 2024) is the central retail-conduct regime the FCA now uses to grade firm behaviour, and the failure modes seen here all land inside the day-to-day work product that accountants sign off on.

For accountants, the operational consequence is direct. A fair-value attestation, an audit memo, or a fair-value methodology review built on the AI's framing imports a defect into audit evidence. The next ICAEW or PCAOB-equivalent file review, a regulatory enquiry, or a client's internal-audit pull will surface the gap, and the accountant carries the professional-quality exposure.

Citation IDs for the findings in this brief: RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47, RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46. Each citation links to the per-finding record, the AI subject answer, and the regulator-issued substrate excerpt the answer was tested against. The RLB Specialist Panel maintains an audit-traceable record of which model produced which answer against which substrate passage. That binding is what makes each finding referenceable in firm work product and in supervisory correspondence.

The findings below are the ones that accountants working under the Consumer Duty are most likely to encounter in the AI tools they already use, and the briefing sections that follow read each finding against the regulator-issued text.


Executive Summary

Accountants (CA/PA) supporting fair-value assessment work for FCA-regulated retail products operate inside FG22/5's qualitative-assessment standard, which expressly does not require quantification of non-monetary costs and benefits. Across two findings in this cell, frontier AI models tested with web search inverted this position: Claude Opus 4.7 affirmed that quantification 'is encouraged where feasible', and Claude Sonnet 4.6 elevated the standard to 'substantiated comparisons'. Both reconstructions contradict FG22/5's published text, and both would, if relied on by accountants supporting fair-value reviews, push the firm to build a methodology that exceeds what the FCA has expressly required.

How AI gets this regulation wrong

Both findings in this cell are instances of inference drift on a clearly published FCA position. The models tested committed to specific methodological answers (quantification expected, substantiated comparison required) where the regulator's text says the precise opposite. The errors are particularly damaging in an accounting context because they convert a 'qualitative is sufficient' standard into a more demanding analytical bar.

AI's Failure Mode | Count | Affected findings
Inference Drift | 1 | Finding #1
Inference Drift | 1 | Finding #2

What that means for your practice

For accountants supporting fair-value assessment work, both findings convert into the same operational risk: a fair-value template that demands quantitative or comparison-based non-monetary analysis the FCA has expressly not required. The over-build cost runs through additional analyst time, additional template length, and additional review cycles for product approvals.

Risk Impact | Count | Affected findings
Regulatory enforcement / professional liability exposure | 2 | Finding #1 · Finding #2

When this affects Accountants (CA/PA)

Accountants (CA/PA) supporting fair-value assessment work for FCA-regulated retail products encounter the Consumer Duty most often at the design and review stages of the fair-value workstream. Templates, review checklists, and product-approval committee submissions all turn on FG22/5's published methodology for non-monetary cost and benefit assessment, and accountants are increasingly using AI tools to surface the methodology requirements at the design stage.

The findings in this cell show that AI tools are unreliable precisely at the methodology-specification stage. Asked about the FCA's expectation on quantification, the Opus 4.7 model affirms that quantification 'is encouraged where feasible'; asked the same question, the Sonnet 4.6 model elevates the standard to 'substantiated comparisons'. FG22/5's actual text says the opposite: the FCA does not expect firms to quantify non-monetary costs and benefits, and a qualitative assessment is sufficient. An accountant who imports either model's framing into the firm's fair-value template builds a quantitative or comparison-based analytical layer the regulator has expressly disclaimed.

The findings at a glance

The table below lists each finding from the Consumer Duty testing in this cell, showing the question area, the AI's failure mode, and the citation identifier for the underlying finding record.

# | Finding title | Type | Citation ID
1 | Inverted FG22/5 on fair-value quantification for non-monetary benefits | Hallucination | RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47
2 | Imposed substantiated-comparison expectation FG22/5 does not require | Hallucination | RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46

Aggregate impact

The two findings cluster on the same FG22/5 passage on fair-value methodology, with two different models producing two different surface forms of the same inversion. The Opus 4.7 finding affirms quantification 'where feasible'; the Sonnet 4.6 finding requires 'substantiated comparisons'. Both contradict FG22/5's published 'qualitative is sufficient' standard.

The pattern is structurally consequential for the accounting profession. The two-model agreement on the same wrong direction is a strong signal that the FCA's clean negative on quantification is poorly represented in the models' training data, or is systematically overridden by inference from general fair-value norms in other regulatory contexts. An accountant relying on either model's answer is not making a one-model error; they are encountering a structural failure in how AI tools handle this specific FCA position.

The implication for fair-value methodology design is that AI-assisted research on FG22/5 cannot be relied on without source-text verification. The over-build cost of running a quantitative or comparison-based non-monetary analysis when the regulator has said qualitative is sufficient is real, and the time and template length it adds are not recoverable through any compliance benefit.

What your team should do

Accountants supporting fair-value assessment work should treat AI tools as a starting point for methodology framing, not as a source of FG22/5 methodology decisions. Any output that describes the FCA's expectation on quantification, qualitative assessment, or comparison-based analysis requires direct verification against the FG22/5 passage before it can be built into a template, review process, or product spec.

As a practical safeguard on fair-value work, when an AI tool affirms a quantification or comparison expectation under FG22/5, treat that affirmation as a known structural failure mode and verify it against the published FG22/5 paragraph. The relevant passage is short and unambiguous: 'The FCA does not expect firms to quantify non-monetary costs and benefits as part of its fair value assessment process, but firms should undertake some form of qualitative assessment.'
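One way a team might operationalise this safeguard is a simple pre-review screen that marks any sentence in AI output touching the methodology standard, so a reviewer checks it against the FG22/5 passage before it reaches a template. The sketch below is illustrative only: the function name `flag_for_verification`, the `Flag` record, and the trigger phrases are hypothetical choices made for this example, not part of any RLB or FCA tooling, and a keyword screen of this kind cannot judge whether a statement is correct. It only routes sentences to a human reviewer.

```python
import re
from dataclasses import dataclass

# FG22/5 passage as quoted in this brief; the reviewer compares flagged text against it.
FG22_5_PASSAGE = (
    "The FCA does not expect firms to quantify non-monetary costs and benefits "
    "as part of its fair value assessment process, but firms should undertake "
    "some form of qualitative assessment."
)

# Hypothetical trigger phrases, chosen for illustration only (an assumption).
TRIGGER_PATTERNS = [
    r"\bquantif\w*",
    r"\bsubstantiated comparisons?\b",
    r"\bcomparison-based\b",
    r"\bqualitative assessment\b",
]


@dataclass
class Flag:
    sentence: str        # sentence from the AI output that needs checking
    matched: list[str]   # trigger patterns that fired on that sentence


def flag_for_verification(ai_output: str) -> list[Flag]:
    """Return sentences that mention the methodology standard and need source checking."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", ai_output.strip())
    flags = []
    for sentence in sentences:
        matched = [p for p in TRIGGER_PATTERNS if re.search(p, sentence, re.IGNORECASE)]
        if matched:
            flags.append(Flag(sentence=sentence, matched=matched))
    return flags


if __name__ == "__main__":
    sample = (
        "Quantification is encouraged where feasible. "
        "The template should list each benefit category."
    )
    for flag in flag_for_verification(sample):
        print("VERIFY against FG22/5:", flag.sentence)
        print("Source passage:", FG22_5_PASSAGE)
```

The screen deliberately over-flags: a sentence that restates FG22/5 correctly is still routed to the reviewer, because the findings in this brief show that confident wording is not a reliable signal of accuracy.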

Where AI tools are most safely used: framing the structure of a fair-value template, identifying which categories of non-monetary cost and benefit should be assessed, drafting accountant-facing summaries of the methodology for review against the source text, and surfacing cross-references between FG22/5 and adjacent FCA expectations. The risk concentrates in the next step, where the AI is asked to specify the actual methodological standard. At that point FG22/5 is the only reliable input.

How RLB Can Help

RegLeg's published Hallucination Research gives UK accountants supporting fair-value work a free pre-flight check before relying on AI tools. The research surfaces which questions about FG22/5's methodology have generated confident but inverted AI output, letting the accountant verify the relevant passage before building the template.

Beyond the published research, RegLeg works with UK firms on bespoke deep-dives mapping AI-supported fair-value workflows to their actual hallucination exposure, and develops CPD-aligned material covering how to interpret AI-generated FG22/5 summaries critically.

Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.