Practitioners — Financial Advisers · updated 2026-06-11 · methodology v2.3

AI Hallucination on Consumer Duty for Financial Advisers in the United Kingdom

Financial Advisers: AI summaries of Consumer Duty may misstate professional obligations

Financial advisers operating under the Consumer Duty are increasingly using AI to validate suitability narratives, draft client-facing fair value rationales for retained-product reviews, generate compliance file-notes against PRIN 2A.4, and stress-test investor disclosures against the FCA's stated expectations. The work product feeds directly into client-facing letters, advice records, and product-governance documentation that the regulator can pull on a thematic review.

Two frontier AI models tested by the RLB Specialist Panel produced five substantive failures on this regulation under audit conditions. The failure classes recorded are: Inference Drift on the Foreseeable-Harm Safe Harbour; Confused Guidance with Rule on Consumer Testing; Inference Drift on Fair Value Quantification Expectation; Inference Drift on Required Depth of Non-Monetary Analysis; and Reversed the PRIN 2A Group-Insurance Exclusion. Questions were prepared by the RLB Specialist Panel based on how this audience actually uses AI in its workflows, and each finding is bound to verbatim regulator-issued source text held as primary substrate.

The Consumer Duty (PS22/9 introducing Principle 12 and PRIN 2A, in force for open products from 31 July 2023 and for closed products from 31 July 2024) is the central retail-conduct regime the FCA now uses to grade firm behaviour, and the failure modes seen here all land inside the day-to-day work product that financial advisers sign off on.

For financial advisers, the operational consequence is direct. A suitability record or client-facing fair-value rationale built on the AI's framing imports a defect into the advice file. A thematic review, a complaint to the Financial Ombudsman Service, or a follow-up supervision visit will surface the gap, and the adviser carries the regulatory exposure.

Citation IDs for the findings in this brief: RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q003-Opus47, RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q007-Sonnet46, RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47, RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46, RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47. Each citation links to the per-finding record, the AI subject answer, and the regulator-issued substrate excerpt the answer was tested against. The RLB Specialist Panel maintains an audit-traceable record of which model produced which answer, against which substrate passage, and the binding is what makes the finding referenceable in firm work product and in supervisory correspondence.

The findings below are the ones that financial advisers working under the Consumer Duty are most likely to encounter in the AI tools they already use, and the briefing sections that follow read each finding against the regulator-issued text.

Executive Summary

Financial Advisers responsible for retail customer outcomes work daily inside the Consumer Duty framework: customer-journey design, fair-value assessments, disclosure templates, and product-governance reviews all turn on the FCA's actual text. Across five findings in this cell, frontier AI models reversed FG22/5's qualitative-only fair-value methodology into an affirmative quantification standard (in both Opus and Sonnet variants), confused FG22/5 guidance with binding PRIN 2A.5 rules on consumer testing, added conditions to the foreseeable-harm safe harbour at PRIN 2A.2, and reversed the express PRIN 2A.1.8R scope exclusion for group insurance distribution.

Each of these maps directly onto the kinds of templates, reviews, and product specs that financial advisers produce, and each carries real over-build cost when relied on without source-text verification.

How AI gets this regulation wrong

The findings in this cell fall into two classes: inference drift and rule misstatement. The models tested committed to specific operational answers where the FCA's actual text sets a single-test standard (PRIN 2A.2), a qualitative-only methodology (FG22/5 fair-value), or an express scope exclusion (PRIN 2A.1.8R). The errors are particularly damaging because each one would push a financial adviser to build a more demanding compliance programme than the FCA has asked for.

AI's Failure Mode | Count | Affected findings
Inference Drift | 1 | Finding#1
Inference Drift | 1 | Finding#2
Inference Drift | 1 | Finding#3
Inference Drift | 1 | Finding#4
Misstated Rule | 1 | Finding#5

What that means for your practice

For Financial Advisers, the five findings cluster on operational over-build risk: customer-warning standards built above the rule, fair-value templates built to a quantification bar the FCA has not set, and scope decisions that pull excluded activities into the Duty's compliance programme. Each represents real cost (additional template review cycles, additional fair-value documentation, additional product-governance steps) with no offsetting compliance benefit, because the underlying rule does not require any of it.

Risk Impact | Count | Affected findings
Regulatory enforcement / professional liability exposure | 5 | Finding#1 · Finding#2 · Finding#3 · Finding#4 · Finding#5

When this affects Financial Advisers

Financial Advisers responsible for retail customer outcomes encounter the Consumer Duty across product-governance reviews, customer-journey design, fair-value assessment templates, disclosure design, and ongoing customer-monitoring frameworks. Each of these workstreams turns on the FCA's published text in PRIN 2A and FG22/5, and each is increasingly supported by AI-assisted research at the framing stage.

The findings in this cell map onto the most operationally consequential questions a financial adviser receives. The foreseeable-harm safe harbour (Finding#1) underpins customer-warning templates across every retail product line, and the model's multi-factor reconstruction would build a defensive standard the rule does not require. The PRIN 2A.5 versus FG22/5 confusion (Finding#2) is the most expensive: the model treats consumer testing as a binding rule under PRIN 2A.5.10R, when it is in fact a recommendation in FG22/5 guidance, and running mandatory testing programmes on that basis consumes real adviser time and money.

The fair-value inversion (Findings#3 and #4) is the most pervasive: both Opus and Sonnet, in different surface forms, push the firm to quantify non-monetary costs and benefits when FG22/5 expressly says a qualitative assessment is sufficient. The group insurance reversal (Finding#5) affects financial advisers in the insurance distribution chain, who would otherwise apply the Duty to activities the FCA has excluded.

Each of these is a question a financial adviser is most likely to ask an AI tool when scoping a new product approval or a fair-value review cycle, and each is a question where the testing showed the AI's answer is wrong in a way that consistently elevates the apparent compliance burden.

The findings at a glance

The table below lists each finding from the Consumer Duty testing in this cell, showing the question area, the AI's failure mode, and the citation identifier for the underlying finding record.

# | Finding title | Type | Citation ID
1 | Fabricated multi-part safe harbour for foreseeable-harm rule | Hallucination | RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q003-Opus47
2 | Confused FG22/5 guidance with PRIN 2A.5 rule on consumer testing | Hallucination | RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q007-Sonnet46
3 | Inverted FG22/5 on fair-value quantification for non-monetary benefits | Hallucination | RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47
4 | Imposed substantiated-comparison expectation FG22/5 does not require | Hallucination | RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46
5 | Reversed the PRIN 2A scope exclusion for group insurance distribution | Hallucination | RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47
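
For teams that log these identifiers in a compliance register, the short Python sketch below groups the five Citation IDs by question to show where more than one model failed on the same question. It rests on an assumption that is not documented on this page: that the last two hyphen-separated fields of each ID are the question number and a model label. The function name split_citation_id is illustrative only.

```python
from collections import defaultdict

# The five Citation IDs listed in this brief.
CITATION_IDS = [
    "RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q003-Opus47",
    "RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q007-Sonnet46",
    "RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47",
    "RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46",
    "RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47",
]


def split_citation_id(citation_id: str) -> tuple[str, str, str]:
    """Split an ID into (prefix, question, model).

    Assumption: the final two hyphen-separated fields are the question
    number and the model label. The ID layout is inferred, not specified.
    """
    prefix, question, model = citation_id.rsplit("-", 2)
    return prefix, question, model


# Collect the model labels recorded against each question.
models_by_question = defaultdict(list)
for cid in CITATION_IDS:
    _, question, model = split_citation_id(cid)
    models_by_question[question].append(model)

# Questions on which more than one model produced a finding.
shared = {q: m for q, m in models_by_question.items() if len(m) > 1}
print(shared)  # {'Q008': ['Opus47', 'Sonnet46']}
```

On this reading, Q008 is the only question with findings against both models, which matches the fair-value findings (#3 and #4) in the table.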

Aggregate impact

The five findings in this cell describe a coherent pattern of model behaviour on retail-conduct rules: the models tested consistently produced answers that elevated the apparent compliance burden above the FCA's actual standard. The foreseeable-harm safe harbour got an extra three conditions, the consumer-understanding outcome got a binding-rule consumer-testing requirement it does not have, the fair-value assessment got a quantification expectation FG22/5 expressly disclaims, and the group insurance distribution chain got pulled into a scope the FCA has expressly excluded.

For financial advisers, this pattern is structurally dangerous because the over-build is invisible at the point of adoption. A fair-value template that asks for quantitative non-monetary analysis looks more rigorous, not less; a customer-warning programme that runs through multiple compliance conditions looks more careful, not less; a Duty-compliant programme that covers group insurance distribution looks more comprehensive, not less. Each of these increases cost and adviser time without any underlying regulatory requirement.

The implication is that AI-assisted research on retail-conduct workstreams cannot be relied on for methodology decisions, scope determinations, or rule-versus-guidance distinctions. Each of these is a question type that produces wrong answers in the direction of more work for the financial adviser, not less.

What your team should do

Financial Advisers should treat AI tools as a starting point for Consumer Duty workstream design, not as a source of methodology or scope decisions. Any output that names a fair-value standard, characterises a rule-versus-guidance distinction, or recites a scope position requires direct verification against the FCA Handbook or FG22/5 before it can be built into a template, review process, or product spec.

For practical safeguards on retail-product work: (a) When an AI tool describes the FCA's expectation on fair-value methodology, pull the FG22/5 passage on qualitative versus quantitative assessment before building or reviewing a template. (b) When an AI tool characterises a PRIN 2A provision as binding, confirm the rule-versus-guidance status by reading the provision directly in the FCA Handbook; the typographic distinction (R, G, E) is the operative signal. (c) When an AI tool gives a scope position on a particular product line (group insurance, large-risk commercial contracts, reinsurance), confirm against PRIN 2A.1.8R before adjusting the product-governance programme.
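
As a rough illustration of safeguard (b), the Python sketch below reads the status letter at the end of a Handbook-style reference and reports what it implies: R for a rule, G for guidance, E for an evidential provision. The regular expression and the function name provision_status are assumptions made for this example, not an FCA-published format, and only the three letters named above are covered. The sketch reports what a citation claims; it cannot confirm that the provision exists or carries that status, so the Handbook text remains the check.

```python
import re

# Status letters named in the text above. Other Handbook suffixes exist
# but are deliberately left out of this sketch.
STATUS_BY_SUFFIX = {
    "R": "rule (binding)",
    "G": "guidance",
    "E": "evidential provision",
}

# Hypothetical pattern: sourcebook code, chapter (optionally lettered, as in
# "2A"), dotted numeric parts, then an optional status letter.
PROVISION_RE = re.compile(r"^([A-Z]+)\s+(\d+[A-Z]?(?:\.\d+)*)([RGE])?$")


def provision_status(reference: str) -> str:
    """Return the status implied by the suffix, or flag for manual checking."""
    match = PROVISION_RE.match(reference.strip())
    if match is None or match.group(3) is None:
        # No recognisable suffix: section-level references such as
        # "PRIN 2A.4" and non-Handbook documents such as "FG22/5" land here.
        return "unknown: check the Handbook directly"
    return STATUS_BY_SUFFIX[match.group(3)]


for ref in ["PRIN 2A.1.8R", "PRIN 2A.5.10R", "PRIN 2A.4", "FG22/5"]:
    print(f"{ref}: {provision_status(ref)}")
```

A reference that an AI tool labels "R" still needs to be read at source: in Finding#2 the model cited a rule-style reference for something the panel found to be a guidance recommendation.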

Where AI tools are most safely used in this practice area: framing customer-journey design, identifying which Duty workstreams are likely relevant to a new product line, drafting customer-facing summaries for review against the source text, and surfacing cross-references between Duty workstreams and adjacent FCA expectations. The risk concentrates in the next step, where the AI is asked to specify the actual methodology, the applicable scope, or the rule-versus-guidance status. At that point the source document is the only reliable input.

How RLB Can Help

RegLeg's published Hallucination Research gives UK Financial Advisers a free pre-flight check before relying on AI tools for Consumer Duty workstream design. Before an AI-assisted template, review process, or product spec is finalised, the research identifies which areas of the Duty (the foreseeable-harm safe harbour, the rule-versus-guidance boundary on consumer testing, the fair-value methodology in FG22/5, and the scope exclusions in PRIN 2A.1.8R) have historically generated confident but incorrect AI output that would push the firm to over-build its compliance programme.

Beyond the published research, RegLeg works with UK retail-facing firms on bespoke regulator deep-dives that map AI-supported workflows within financial-adviser practice to their actual hallucination exposure. The deep-dive identifies which workstreams (customer-warning templates, fair-value assessments, product-governance reviews, distribution-chain scope decisions) warrant additional controls or independent verification steps. RegLeg also conducts a confidential review of the firm's existing AI-use policy against the failure-mode catalogue, delivering a prioritised remediation plan.

For teams that want to build durable in-house capability, RegLeg develops training material and CPD-aligned content tailored to the UK Financial Adviser context, covering how to interpret AI-generated regulatory summaries critically and how to document AI-assisted decision-making consistently with FCA conduct standards.

Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.