Sonnet maps the geometry of hallucinations buried deep inside FCA Consumer Duty doctrine.
— RLB Specialist Panel
Inference Drift on Fair Value Quantification Expectation, Inference Drift on Required Depth of Non-Monetary Analysis: Consumer Duty (PS22/9 + PRIN 2A) under audit.
Two frontier AI models tested by the RLB Specialist Panel produced two substantive failures on the Consumer Duty, with material implications for the work product of accountants.
Frontier AI models, asked questions of the kind accountants put to them on the Consumer Duty in real workflows, produce confident answers that drift from the regulator's actual position on Principle 12, PRIN 2A, and the FCA's Feedback Statement record. The failure classes seen are: Inference Drift on Fair Value Quantification Expectation, Inference Drift on Required Depth of Non-Monetary Analysis.
Questions were prepared by the RLB Specialist Panel to reflect how the target audience actually uses AI in its workflows. Each question is paired with verbatim regulator-issued source text held as primary substrate, against which the AI subject's answer is graded. Two frontier AI models were the subjects under test on this regulation. The panel binds each finding to the substrate excerpt it tests against; the binding is what makes each finding referenceable and audit-traceable.
Citation: RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47. Model under test: Claude Opus 4.7. Failure mode: inference drift.
Question put to the model: Whether the Consumer Duty requires firms to quantify non-monetary benefits as part of a fair value assessment, and what methodology the FCA expects.
What the model answered: The model stated quantification is encouraged where feasible and that qualitative assessment is acceptable where quantification is impractical, provided reasoning is robust.
Regulator-issued position (verbatim): "The FCA does not expect firms to quantify non-monetary costs and benefits as part of its fair value assessment process, but firms should undertake some form of qualitative assessment."
Reading: The regulator's actual position is that quantification of non-monetary items is NOT expected. The model's framing of quantification as the preferred path with qualitative as a fallback inverts the FCA's express position and would push firms toward unnecessary financial modelling exercises.
Citation: RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46. Model under test: Claude Sonnet 4.6. Failure mode: inference drift.
Question put to the model: Whether the Consumer Duty requires firms to quantify non-monetary benefits in their fair value assessment, and what level of analysis the FCA expects.
What the model answered: The model stated the FCA does not mandate a single financial methodology but expects firms to go beyond qualitative description and provide substantiated comparisons.
Regulator-issued position (verbatim): "From FCA guidance: 'The FCA does not expect firms to quantify non-monetary costs and benefits as part of its fair value assessment process, but firms should undertake some form of qualitative assessment.'"
Reading: The regulator's documented guidance accepts qualitative assessment as sufficient. The model's drift from that floor to a 'substantiated comparison' standard pushes firms above what the FCA actually requires and creates an artificial gap between practice and rule.
For accountants, the operational consequence is direct. A fair-value attestation, an audit memo, or a fair-value methodology review built on the AI's framing imports a defect into audit evidence. The next ICAEW or PCAOB-equivalent file review, a regulatory enquiry, or a client's internal-audit pull will surface the gap, and the accountant carries the professional-quality exposure.
The failures recorded here are not stylistic. Each one would, if relied on, shift the firm's documented position on a specific Consumer Duty obligation: scope of application, foreseeable-harm safe harbour, fair-value methodology, or the current status of pre-Consumer Duty supervisory expectations. The work product of accountants sits between the firm and the regulator, and it has to track the rule as written.
On the question of quantification of non-monetary benefits in fair value assessment: "The FCA does not expect firms to quantify non-monetary costs and benefits as part of its fair value assessment process, but firms should undertake some form of qualitative assessment." (source: regulator-issued primary substrate held by the RLB Specialist Panel; citation RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47).
On the question of the level of fair value analysis the FCA expects on non-monetary items: "From FCA guidance: 'The FCA does not expect firms to quantify non-monetary costs and benefits as part of its fair value assessment process, but firms should undertake some form of qualitative assessment.'" (source: regulator-issued primary substrate held by the RLB Specialist Panel; citation RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46).
Frontier AI models are useful drafting partners for accountants, but they are not a substitute for the rule text. The failure patterns recorded on the Consumer Duty cluster around three lenses. First, scope drift, in which the model misstates what the rule covers, illustrated by the reversed group-insurance exclusion under PRIN 2A and the silent omission of FSMA 2023 from the statutory architecture answer. Second, methodology drift, in which the model elevates guidance (FG22/5) to rule status (PRIN 2A) or imports a stricter expectation than the regulator sets, illustrated here by the non-monetary quantification framing the FCA expressly disavowed. Third, evidence-avoidance, in which the model refuses to commit on a question that the regulator has answered in plain text in a documented Feedback Statement, illustrated by the FS25/2 Dear CEO letter retirement count.
For accountants, the practical reading is: AI output on the Consumer Duty needs to be checked against verbatim substrate (PRIN 2A, PS22/9, FG22/5, FS25/2) before it lands in a work product the firm or the regulator will rely on. The model's confidence is not a reliable signal of accuracy on this regulation, because the failures recorded are confident-wrong, not hesitant-wrong.
The Consumer Duty is a regime the FCA grades in writing, through Feedback Statements, supervisory correspondence, and thematic-review outputs. An audit-traceable AI use of the regulation therefore has to rest on the rule text together with the FCA's documented Feedback record.
The RLB Specialist Panel runs structured audits of frontier AI models against high-stakes regulator-issued texts. Each finding on the Consumer Duty is bound to the regulator-issued source it tests against, recorded with the AI subject's answer in full, and held with the verbatim regulator-issued passage that grades the answer. The Panel offers AI labs a partnership channel: regulator-substrate-bound finding sets, audience-tagged per profession and per sector-department, that feed into model evaluation pipelines and post-training reinforcement signals.
Firms with a stake in the work product (legal, compliance, risk, product, board secretariat) get a referenceable finding library on the regulations they live with. Every finding on this regulation is held with the question, the AI subject's full answer, the regulator-issued passage that grades the answer, and the model identity, so the record can be audited end-to-end and the binding can be re-verified at any time.
Chartered and public accountants engaged on Consumer Duty fair-value reporting should, on every AI-drafted audit memo or attestation work product, apply the following discipline:
These findings and the associated work have been published in the interest of a safer AI ecosystem. Any party reading this or any other finding on reglegbrief.com may contact us and has an unconditional right of reply; the Specialist Panel will publish any factual correction or contextual response alongside the original finding, with no editorial gatekeeping. Researchers, regulators, and compliance teams with questions on methodology or specific findings can reach the Specialist Panel via the same channel.
RegLeg Brief is operated by Verdus Technologies Pte. Ltd. (UEN 201616982R), incorporated in Singapore. The RLB Specialist Panel, with an aggregate of over 60 years of public-policy and industry experience, documents only confirmed hallucination findings, under a methodology that requires a verbatim regulator excerpt for every documented claim. All findings, citation IDs, model outputs, regulator excerpts, and methodology notes are open-access.
Primary source verified: FCA PS22/9 + PRIN 2A + FG22/5 · Substrate documents: p_05_REGULATION_FG22_5___Fair_value_assessment__no_quant_2.html · FCA portal: fca.org.uk
Citation IDs referenced:
RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47
RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46