AI Hallucination on Consumer Duty for Lawyers in the United Kingdom — AI Hallucination Research

Executive Summary

The Consumer Duty (PS22/9 and PRIN 2A, with FG22/5 guidance) is the operational core of the UK retail conduct framework, and Consumer Duty advice work sits squarely in the practice of every UK conduct lawyer serving FCA-authorised firms.

Across eight findings in this cell, frontier AI models tested with web search produced confident reconstructions of the Duty that the FCA's published text directly contradicts: a multi-factor compliance test substituted for a single-condition safe harbour at PRIN 2A.2, a binding PRIN 2A.5.10R citation imposed where FG22/5 guidance lives, a reversed group insurance scope exclusion paired with a fabricated 'CP23/something' consultation paper, an invented April/August 2025 timeline for FS25/2 withdrawals reproduced across multiple questions, and an evasion response paired with a fabricated Clifford Chance citation.

Each of these maps directly onto the kinds of memos, opinions, and supervisor-facing correspondence that conduct lawyers produce daily, and each carries direct PI exposure when relied on without source-text verification.

How AI gets this regulation wrong

The findings in this cell are inference drift and rule misstatement, not refusal. The models tested committed to specific answers where the correct posture would have been to surface the actual PRIN 2A or FG22/5 passage, or to flag that the model could not locate the regulator's text. Instead, both models generated content with the surface features of conduct-regulatory analysis (defined-term usage, rule citations, methodology language), while the underlying claims were either fabricated or directly contradicted by the FCA's published text.

AI's Failure Mode	Count	Affected findings
Misstated Rule	2	Finding#1 · Finding#6
Inference Drift	6	Finding#2 · Finding#3 · Finding#4 · Finding#5 · Finding#7 · Finding#8

What that means for your practice

For UK conduct lawyers, all eight findings cluster on the same risk category: professional indemnity exposure when advice misstates the regulatory position. The group insurance scope reversal (Finding#6) is the most acute: a memo that brings excluded activities into the Duty's perimeter on the strength of a fabricated consultation paper number is professionally indefensible. The FSMA basis (Finding#1) and the PRIN 2A.5 versus FG22/5 confusion (Finding#3) are subtler but no less damaging in a board-paper or opinion context. The FS25/2 fabrications (Findings#4, 5, 7, 8) are the easiest to detect on review but also the easiest to import in a hurry.

Risk Impact	Count	Affected findings
Regulatory enforcement / professional liability exposure	4	Finding#1 · Finding#2 · Finding#3 · Finding#6
Operational decisions based on a fabricated regulator record	4	Finding#4 · Finding#5 · Finding#7 · Finding#8

When this affects Lawyers

UK conduct lawyers encounter the Consumer Duty across an enormous range of engagements: drafting board reports under PRIN 2A.8.3R, advising on the foreseeable-harm provision in customer-warning templates, opining on Duty scope across distribution chains, supporting FCA enforcement defence, advising on the fair-value methodology in product-governance reviews, and tracking the FCA's supervisory-letter landscape (Dear CEO letters, multi-firm reports, feedback statements). Each of these mandates puts the lawyer in a position of stating which provision applies, what it requires, and how an amendment changes the consolidated text.

The findings in this cell map onto the most common questions a UK conduct lawyer receives. The FSMA-basis question (Finding#1) is the kind of preliminary authority cited in board papers and supervisor-facing memos. The foreseeable-harm safe harbour (Finding#2) underpins almost every customer-warning template, and a lawyer who imports the model's multi-condition reconstruction will build a defensive standard the rule does not require. The PRIN 2A.5 versus FG22/5 confusion (Finding#3) is the cleanest example of a sourcing error: the model treats guidance as rule, and the resulting opinion overstates what the firm must do.

The group insurance reversal (Finding#6) is the most dangerous: a scoping memo built on the AI's reversal of PRIN 2A.1.8R would expose the firm to over-application of the Duty, and the fabricated 'CP23/something' consultation paper is the kind of citation that survives a reviewer's quick look but fails any reference check.

The FS25/2 findings (Findings#4, 5, 7, 8) reveal an even more structural problem. The same fabricated April and August 2025 timeline appears under multiple differently framed questions, and the Sonnet 4.6 model combines an evasion response on a published figure with a fabricated Clifford Chance citation. For a lawyer keeping current on the FCA's supervisory record, neither AI's account is usable; the only safe source is FS25/2 itself.

The findings at a glance

The table below lists each finding from the Consumer Duty testing in this cell, showing the question area, the AI's failure mode, and the citation identifier for the underlying finding record.

#	Finding title	Type	Citation ID
1	Misstated FSMA 2023 role in creating the Consumer Duty	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q002-Sonnet46
2	Fabricated multi-part safe harbour for foreseeable-harm rule	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q003-Opus47
3	Confused FG22/5 guidance with PRIN 2A.5 rule on consumer testing	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q007-Sonnet46
4	Split FS25/2 single-event withdrawal into invented April/August 2025 events	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q013-Opus47
5	Declined to disclose a verified FS25/2 figure the regulator published	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q013-Sonnet46
6	Reversed the PRIN 2A scope exclusion for group insurance distribution	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47
7	Repeated FS25/2 fabricated April/August 2025 timeline across a second question	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q020-Opus47
8	Combined evasion with a fabricated Clifford Chance citation on Dear CEO letters	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q020-Sonnet46

Aggregate impact

The eight findings in this cell describe a specific failure mode that conduct lawyers should expect when AI tools are used for Consumer Duty research. Both models tested were willing to commit, with no hedging, to fabricated regulatory tests (the multi-condition safe harbour at PRIN 2A.2), reversed scope exclusions (the group insurance carve-out at PRIN 2A.1.8R), and invented supervisory timelines (the April/August 2025 FS25/2 events). In every case the published FCA text resolves the question; in every case the model produced a confident answer that the published text contradicts.

The pattern points to a generation behaviour that lawyers should treat as a near-certain failure mode in this domain. When the FCA documents a provision that turns on a single test (reasonable belief), the model is liable to reconstruct it as a composite test that looks more 'comprehensive' to a non-specialist reader. When the FCA documents a scope exclusion, the model is liable to override it with a fabricated authority. When the FCA documents a specific factual record (FS25/2's 90+ Dear CEO letter withdrawals), the model is liable to either fabricate dates around it or decline to disclose it.

For practising conduct lawyers, the implication is that AI-assisted research on the Consumer Duty cannot be relied on for rule reconstruction, scope determination, or factual recital of the FCA's supervisory record. Each is a question type that the AI handles in a confident, fluent register, and each is one where the testing showed the AI was wrong in ways the regulator's own text resolves.

What your team should do

UK conduct legal teams should treat AI tools as a search-prompt generator on Consumer Duty questions, not a source of regulatory text. Any output that names a specific PRIN 2A provision, characterises the FCA's scope position, or recites figures from a feedback statement requires direct verification against the FCA Handbook or the published feedback statement before it can be transmitted to a client.

The findings in this cell show that the verification cost is not theoretical: a confidently asserted multi-condition safe harbour, a confidently reversed scope exclusion, and a confidently fabricated supervisory timeline were all shown by direct reference to the regulator's text to be wrong.

For practical safeguards on Consumer Duty work:

(a) When an AI tool quotes a PRIN 2A provision, pull the corresponding paragraph from the FCA Handbook before relying on the wording; the Handbook is publicly available.
(b) When an AI tool names a consultation paper (CP), feedback statement (FS), or finalised guidance number, treat the citation as a suggested search term, not a verified reference; placeholder patterns like 'CP23/something' are a strong tell.
(c) When an AI tool gives a specific figure or date from an FCA publication, confirm it against the underlying PDF before it appears in any advice or board paper.
(d) Build into the firm's AI-use policy a specific carve-out for Consumer Duty scope and fair-value methodology questions: these are precisely the question types where this testing shows the AI produces confident wrong answers.
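
Safeguards (a) to (c) can be partly mechanised as a pre-review flagging pass. The Python sketch below is a minimal illustration and not a tested compliance tool. The pattern names, the regular expressions, and the function flag_for_verification are all assumptions made for this example. It lists strings in AI output that look like PRIN 2A provisions, FCA publication numbers, placeholder citations, or month-and-year dates, so a reviewer knows what to check against the source. It does not verify anything itself.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical citation shapes, chosen for illustration only.
PATTERNS = {
    # e.g. "PRIN 2A.5.10R" or "PRIN 2A.2"
    "handbook_provision": re.compile(r"\bPRIN\s2A(?:\.\d+)+[RGE]?\b"),
    # e.g. "FS25/2", "PS22/9", "FG22/5"
    "fca_publication": re.compile(r"\b(?:CP|FS|FG|PS)\d{2}/\d+\b"),
    # e.g. "CP23/something": a publication prefix with no real number
    "placeholder_citation": re.compile(r"\b(?:CP|FS|FG|PS)\d{2}/[A-Za-z]\w*"),
    # e.g. "August 2025"
    "month_year_date": re.compile(rf"\b(?:{MONTHS})\s+\d{{4}}\b"),
}


def flag_for_verification(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs in order of appearance.

    Every hit is something to check against the FCA Handbook or the
    published document; a match says nothing about correctness.
    """
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), kind, match.group(0)))
    hits.sort()  # order by position in the text
    return [(kind, cited) for _, kind, cited in hits]


# Illustrative AI-style output, composed for this example.
sample = (
    "Under PRIN 2A.5.10R firms must test communications; see "
    "CP23/something and the August 2025 withdrawals in FS25/2."
)
for kind, cited in flag_for_verification(sample):
    print(f"VERIFY [{kind}]: {cited}")
```

A pass like this only narrows where the reviewer looks. A well-formed citation such as FS25/2 is flagged in exactly the same way as a placeholder, because the testing above shows that well-formed citations can still carry fabricated dates or figures.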

Where AI tools are most safely used in this practice area: framing the structure of a Consumer Duty memo, identifying which PRIN 2A sub-chapters are likely relevant to a particular product line, drafting client-facing summaries of regulatory architecture for review against the source text, and surfacing cross-references between the Duty and adjacent FCA instruments. The risk concentrates in the next step, where the AI is asked to specify the actual rule text, the applicable scope position, or the meaning of a regulator-specific publication. At that point the source document is the only reliable input.

How RLB Can Help

RegLeg's published Hallucination Research gives UK conduct legal teams a free pre-flight check before relying on AI tools for Consumer Duty research. Before an AI-assisted opinion, board paper, or scoping memo is finalised, the research identifies precisely which areas of the Consumer Duty text have historically generated confident but incorrect AI output: the foreseeable-harm safe harbour, the rule-versus-guidance boundary in PRIN 2A.5, the scope exclusions in PRIN 2A.1.8R, the fair-value methodology in FG22/5, and recent FCA feedback statements. That forewarning lets the team apply targeted scrutiny rather than blanket scepticism, making AI assistance genuinely efficient without importing undetected professional liability.

Beyond the published research, RegLeg works with UK firms on bespoke regulator deep-dives that map AI-supported workflows within conduct-lawyer practice to their actual hallucination exposure. Activities such as drafting customer-warning templates, opining on Duty scope across distribution chains, supporting fair-value reviews, or summarising the FCA's supervisory-letter landscape carry different risk profiles, and the deep-dive surfaces which ones warrant additional controls or independent verification steps.

RegLeg also conducts a confidential review of the firm's existing AI-use policy against the failure-mode catalogue, delivering a prioritised remediation plan that distinguishes low-risk efficiency gains from higher-risk applications where AI output should be treated as a first draft only.

For teams that want to build durable in-house capability, RegLeg develops training material and CPD-aligned content tailored to the UK conduct-lawyer context. This covers how to interpret AI-generated regulatory summaries critically, how to structure escalation where AI confidence is high but human verification is essential, and how to document AI-assisted decision-making in a manner consistent with FCA conduct standards. The material can be delivered as standalone workshops or integrated into the firm's existing CPD calendar.