AI Hallucination on Consumer Duty for Accountants (CA/PA) in the United Kingdom


Executive Summary

Accountants (CA/PA) supporting fair-value assessment work for FCA-regulated retail products operate inside FG22/5's qualitative-assessment standard, which expressly does not require quantification of non-monetary costs and benefits. Across two findings in this cell, frontier AI models tested with web search inverted this position: Claude Opus 4.7 affirmed that quantification 'is encouraged where feasible', and Claude Sonnet 4.6 elevated the standard to 'substantiated comparisons'. Both reconstructions contradict FG22/5's published text, and both would, if relied on by accountants supporting fair-value reviews, push the firm to build a methodology that exceeds what the FCA has expressly required.

How AI gets this regulation wrong

Both findings in this cell are instances of inference drift on a clearly published FCA position. The models tested committed to specific methodological answers (quantification expected, substantiated comparison required) where the regulator's text says the precise opposite. The errors are particularly damaging in an accounting context because they convert a 'qualitative is sufficient' standard into a more demanding analytical bar.

AI's Failure Mode	Count	Affected findings
Inference Drift	2	Finding#1 · Finding#2

What that means for your practice

For accountants supporting fair-value assessment work, both findings convert into the same operational risk: a fair-value template that demands quantitative or comparison-based non-monetary analysis the FCA has expressly not required. The over-build cost runs through additional analyst time, additional template length, and additional review cycles for product approvals.

Risk Impact	Count	Affected findings
Regulatory enforcement / professional liability exposure	2	Finding#1 · Finding#2

When this affects Accountants (CA/PA)

Accountants (CA/PA) supporting fair-value assessment work for FCA-regulated retail products encounter the Consumer Duty most often at the design and review stages of the fair-value workstream. Templates, review checklists, and product-approval committee submissions all turn on FG22/5's published methodology for non-monetary cost and benefit assessment, and accountants are increasingly using AI tools to surface the methodology requirements at the design stage.

The findings in this cell show that AI tools are unreliable precisely at the methodology-specification stage. Asked about the FCA's expectation on quantification, the Opus 4.7 model affirms that quantification 'is encouraged where feasible'; asked the same question, the Sonnet 4.6 model elevates the standard to 'substantiated comparisons'. FG22/5's actual text says the opposite: the FCA does not expect firms to quantify non-monetary costs and benefits, and a qualitative assessment is sufficient. An accountant who imports either model's framing into the firm's fair-value template builds a quantitative or comparison-based analytical layer the regulator has expressly disclaimed.

The findings at a glance

The table below lists each finding from the Consumer Duty testing in this cell, showing the finding title, the finding type, and the citation identifier for the underlying finding record.

#	Finding title	Type	Citation ID
1	Inverted FG22/5 on fair-value quantification for non-monetary benefits	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47
2	Imposed substantiated-comparison expectation FG22/5 does not require	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46

Aggregate impact

The two findings cluster on the same FG22/5 passage on fair-value methodology, with two different models producing two different surface forms of the same inversion. The Opus 4.7 finding affirms quantification 'where feasible'; the Sonnet 4.6 finding requires 'substantiated comparisons'. Both contradict FG22/5's published 'qualitative is sufficient' standard.

The pattern matters for the accounting profession because it is not confined to one model. Two models erring in the same direction is a strong signal that the FCA's clean negative on quantification is poorly represented in the models' training data, or is systematically overridden by inference from general fair-value norms in other regulatory contexts. An accountant relying on either model's answer is not making a one-model error; they are encountering a structural failure in how AI tools handle this specific FCA position.

The implication for fair-value methodology design is that AI-assisted research on FG22/5 cannot be relied on without source-text verification. The over-build cost of running a quantitative or comparison-based non-monetary analysis when the regulator has said qualitative is sufficient is real, and the time and template length it adds are not recoverable through any compliance benefit.

What your team should do

Accountants supporting fair-value assessment work should treat AI tools as a starting point for methodology framing, not as a source of FG22/5 methodology decisions. Any output that describes the FCA's expectation on quantification, qualitative assessment, or comparison-based analysis requires direct verification against the FG22/5 passage before it can be built into a template, review process, or product spec.

For practical safeguards on fair-value work: when an AI tool affirms a quantification or comparison expectation under FG22/5, treat that affirmation as a likely instance of the failure mode described here and verify it against the published FG22/5 paragraph before acting on it. The relevant passage is short and unambiguous: 'The FCA does not expect firms to quantify non-monetary costs and benefits as part of its fair value assessment process, but firms should undertake some form of qualitative assessment.'
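
For teams that want to apply this safeguard mechanically, the check can be sketched as a simple text screen. The Python below is a minimal illustration only, not a RegLeg tool or an FCA-endorsed method: the function name and the trigger patterns are assumptions chosen for this example, and a match means only "verify this sentence against FG22/5", not that the sentence is wrong.

```python
import re

# Hypothetical trigger patterns (assumed for illustration): wording that
# suggests the AI output is stating a methodological standard for
# non-monetary costs and benefits.
TRIGGER_PATTERNS = [
    r"\bquantif\w*",
    r"\bsubstantiated\s+comparisons?\b",
    r"\bcomparison-based\b",
    r"\bqualitative\s+assessment\b",
]


def flag_for_verification(ai_output: str) -> list[str]:
    """Return the sentences in ai_output that match any trigger pattern.

    Flagged sentences should be checked against the FG22/5 source text
    before they reach a template, review process, or product spec.
    """
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", ai_output.strip())
    compiled = [re.compile(p, re.IGNORECASE) for p in TRIGGER_PATTERNS]
    return [s for s in sentences if any(c.search(s) for c in compiled)]


if __name__ == "__main__":
    sample = (
        "Quantification is encouraged where feasible. "
        "Templates should list each product feature. "
        "Firms need substantiated comparisons."
    )
    for sentence in flag_for_verification(sample):
        print("VERIFY AGAINST FG22/5:", sentence)
```

A screen like this only routes sentences to a human reviewer; the verification itself still has to be done against the published FG22/5 passage.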

Where AI tools are most safely used: framing the structure of a fair-value template, identifying which categories of non-monetary cost and benefit should be assessed, drafting accountant-facing summaries of the methodology for review against the source text, and surfacing cross-references between FG22/5 and adjacent FCA expectations. The risk concentrates in the next step, where the AI is asked to specify the actual methodological standard. At that point FG22/5 is the only reliable input.

How RLB Can Help

RegLeg's published Hallucination Research gives UK accountants supporting fair-value work a free pre-flight check before relying on AI tools. The research surfaces which questions about FG22/5's methodology have generated confident but inverted AI output, letting the accountant verify the relevant passage before building the template.

Beyond the published research, RegLeg works with UK firms on bespoke deep-dives mapping AI-supported fair-value workflows to their actual hallucination exposure, and develops CPD-aligned material covering how to interpret AI-generated FG22/5 summaries critically.