AI Hallucination on Consumer Duty for Risk teams at Payment Institutions firms in the United Kingdom

Executive Summary

The FCA Consumer Duty (PS22/9 and PRIN 2A, with FG22/5 guidance) is the UK retail conduct framework that shapes day-to-day work for Risk teams at Payment Institutions firms. Across the three findings in this cell, frontier AI models tested with web search produced confidently wrong reconstructions of the FCA's text in ways that bear directly on Risk workstreams at Payment Institutions firms. Each error converts into either an over-build cost (defensive controls or templates the rule does not require) or a supervisory-record misstatement that surfaces on review.

None of the errors deliver any compliance benefit; all of them add operational cost or expose the team to challenge.

How AI gets this regulation wrong

The findings in this cell are inference drift and rule-misstatement, not refusal. The models committed to specific operational answers where the FCA's actual text would have resolved the question differently. For Risk teams at Payment Institutions firms, the consequence is that AI-assisted summaries of the FCA's published positions cannot be relied on without source-text verification.

AI's Failure Mode	Count	Affected findings
Inference Drift	3	Finding#1 · Finding#2 · Finding#3

What that means for your team

For Risk teams at Payment Institutions firms, the findings cluster on the same risk category: regulatory enforcement exposure where the FCA's text resolves the question differently, paired with the operational cost of building controls or analytical work the rule does not require. The audit trail of the team's regulatory engagement becomes the durable record, and importing AI-fabricated reconstructions into that record undermines the team's ability to respond to a supervisory or internal-audit challenge.

Risk Impact	Count	Affected findings
Regulatory enforcement / professional liability exposure	3	Finding#1 · Finding#2 · Finding#3

When this affects your department

Risk teams at Payment Institutions firms encounter the Consumer Duty across the team's core workstreams. The Payment Institutions business model brings retail customers and the Duty's product-governance, fair-value, and consumer-understanding obligations into the team's daily work, and the team increasingly uses AI tools to surface FCA requirements at the framing and drafting stages.

The findings in this cell map onto the most operationally consequential question types for this audience. Where the AI is asked about a binding rule, an FCA scope position, a methodology expectation, or a recent supervisory action, the models tested produce confident wrong answers. The error patterns are consistent between Opus 4.7 and Sonnet 4.6, suggesting structural failure modes rather than model-specific slips.

The findings at a glance

The table below summarises each finding from our testing on the Consumer Duty for this audience, listing the finding title, the type of AI failure observed, and the citation ID.

#	Finding title	Type	Citation ID
1	Fabricated multi-part safe harbour for foreseeable-harm rule	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q003-Opus47
2	Inverted FG22/5 on fair-value quantification for non-monetary benefits	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Opus47
3	Imposed substantiated-comparison expectation FG22/5 does not require	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q008-Sonnet46

Aggregate impact

For Risk teams at Payment Institutions firms, the findings show a coherent pattern: AI tools produce confident, operationally consequential answers on Consumer Duty questions that the FCA's published text directly contradicts. The pattern holds across both models tested and across the cross-cutting rules, the four-outcomes structure, the scope exclusions, the fair-value methodology, and the FCA's recent supervisory-letter record.

The implication for the team's AI-use posture is structural rather than tactical. Any AI-assisted summary that names a specific PRIN 2A provision, characterises an FCA scope position, or recites figures from a feedback statement requires direct source verification before it can be built into a template, brief, or control framework. The verification cost is real, but the over-build cost of relying on the AI's framing is larger.

What your team should do

Risk teams at Payment Institutions firms should treat AI tools as a starting point for Consumer Duty research, not as a source of FCA text. Any output that quotes a PRIN 2A provision, describes the FCA's scope position, or recites figures from a feedback statement must be verified directly against the FCA Handbook or the published feedback statement before it is sent to a colleague or included in a deliverable. The findings in this cell show that the need for verification is not theoretical.

For practical safeguards on Consumer Duty work: (a) pull the underlying PRIN 2A paragraph from the FCA Handbook before relying on an AI tool's characterisation of a rule; (b) confirm any AI-supplied figure or date from an FCA publication against the underlying PDF before it appears in a deliverable; and (c) build into the team's AI-use practice a specific carve-out for scope and methodology questions, which are precisely the question types where this testing shows AI tools produce confident wrong answers.
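
One way to put safeguards (a) and (b) into practice is a simple pre-flight filter that flags sentences in an AI draft for manual source checking. The Python sketch below is illustrative only: the function name flag_for_verification, the TRIGGERS patterns, and the sample text are assumptions introduced here, not part of the testing described in this report. Pattern matching cannot judge whether a flagged statement is correct; it only routes sentences to a human reviewer.

```python
import re

# Hypothetical trigger patterns (assumptions for illustration); a team would
# tune these to the publications and drafting habits it actually works with.
MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)
TRIGGERS = {
    # Mentions of the Duty's rule or guidance sources.
    "rule_reference": re.compile(r"\b(?:PRIN\s*2A(?:\.\d+)*|FG22/5|PS22/9)\b"),
    # Percentages and sterling amounts.
    "figure": re.compile(r"£\s?\d[\d,]*(?:\.\d+)?|\b\d+(?:\.\d+)?\s?(?:%|per cent)"),
    # Dates written as "1 March 2024".
    "date": re.compile(r"\b\d{1,2}\s+(?:" + MONTHS + r")\s+\d{4}\b"),
}


def flag_for_verification(ai_output: str) -> list[tuple[str, list[str]]]:
    """Return (sentence, reasons) pairs for sentences that need a source check."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", ai_output.strip())
    flagged = []
    for sentence in sentences:
        reasons = [name for name, pattern in TRIGGERS.items() if pattern.search(sentence)]
        if reasons:
            flagged.append((sentence, reasons))
    return flagged


if __name__ == "__main__":
    # Invented AI output for illustration only; none of these sentences
    # is a statement of the rules or of any FCA publication.
    sample = (
        "The Duty applies to retail business. "
        "PRIN 2A.2.8 contains a safe harbour. "
        "The FCA reviewed 40% of firms on 1 March 2024."
    )
    for sentence, reasons in flag_for_verification(sample):
        print(", ".join(reasons), "->", sentence)
```

A filter like this deliberately over-flags: a sentence that merely names PRIN 2A is routed for checking even if it is accurate, which fits the report's position that the source document, not the AI summary, is the reliable input.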

Where AI tools are most safely used in this practice area: framing the structure of a Duty-related deliverable, identifying which Duty workstreams are likely relevant to a particular product line, drafting first-draft summaries for review against the source text, and surfacing cross-references between Duty obligations and adjacent FCA expectations. The risk concentrates in the rule-specification, methodology, and supervisory-record steps. At those steps, the source document is the only reliable input.

How RLB Can Help

RegLeg's published Hallucination Research is available as a free pre-flight check for Risk teams at Payment Institutions firms operating across UK conduct supervision. Before the team relies on AI-assisted output for Consumer Duty interpretation, it can use the research to identify which areas of the Duty's text have historically generated confident but incorrect AI output, and apply targeted scrutiny to those areas.

RegLeg also works with UK firms on bespoke regulator deep-dives that map AI-supported workflows in the Risk function at Payment Institutions firms to their actual hallucination exposure, and conducts confidential reviews of the firm's existing AI-use policy against the failure-mode catalogue. For teams building durable in-house capability, RegLeg develops training material tailored to the Risk x Payment Institutions context.