AI Hallucination on Consumer Duty for Stockbrokers / Trading Reps in the United Kingdom

Executive Summary

Stockbrokers and Trading Representatives executing retail orders operate under the foreseeable-harm provision at PRIN 2A.2 and within the FCA's supervisory-letter landscape that frames trading-desk policy. Across the two findings in this cell, frontier AI models tested with web search reconstructed PRIN 2A.2's single-test safe harbour as a multi-factor compliance check, and split the FS25/2 Dear CEO letter withdrawals into an April and August 2025 timeline that the regulator never published. Both errors translate into operational consequences for a retail-facing trading desk.

How AI gets this regulation wrong

Both findings are instances of inference drift on operational provisions. The model substituted a multi-condition test for PRIN 2A.2's single reasonable-belief standard, and fabricated specific April and August 2025 dates for an FS25/2 action that actually occurred in March 2025.

AI's Failure Mode	Count	Affected findings
Inference Drift	2	Finding#1, Finding#2

What that means for your practice

For retail-facing stockbrokers and trading representatives, both findings have direct operational consequences: a customer-warning template built on the AI's multi-factor safe harbour goes beyond what the rule requires, and a desk-level compliance briefing that records the AI's invented FS25/2 timeline misrepresents the live supervisory record.

Risk Impact	Count	Affected findings
Regulatory enforcement / professional liability exposure	1	Finding#1
Operational decisions based on a fabricated regulator record	1	Finding#2

When this affects Stockbrokers / Trading Reps

Stockbrokers and Trading Representatives executing for retail clients encounter the Consumer Duty most often at the customer-warning and order-handling layers. The PRIN 2A.2 foreseeable-harm provision sits at the centre of any desk policy that asks whether a particular order, product, or customer category triggers a warning or escalation, and the FCA's supervisory-letter record frames how desk-level compliance teams keep current with the regulator's expectations.

The findings in this cell map onto two of the most operationally consequential question types for a trading-desk reader. The foreseeable-harm safe harbour (Finding#1) is the rule a trader needs to know in plain form. It is a single test: does the firm reasonably believe the customer understands and accepts the risk? The model's multi-factor reconstruction, if imported into a desk policy or warning template, would set a defensive standard the rule does not require, adding customer friction without a regulatory benefit.

The FS25/2 finding (Finding#2) affects how the desk keeps current with the supervisory record: a compliance briefing that records the AI's fabricated April and August 2025 timeline misrepresents what the FCA actually published in March 2025.

The findings at a glance

The table below lists each finding from the Consumer Duty testing in this cell, showing the finding title, the finding type, and the citation identifier for the underlying finding record.

#	Finding title	Type	Citation ID
1	Fabricated multi-part safe harbour for foreseeable-harm rule	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q003-Opus47
2	Split FS25/2 single-event withdrawal into invented April/August 2025 events	Hallucination	RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q013-Opus47

Aggregate impact

The two findings cluster on operational risk for retail-facing stockbrokers and trading representatives. The PRIN 2A.2 reconstruction would, if relied on, push the desk to build a more defensive customer-warning standard than the rule requires; the FS25/2 fabrication would, if relied on, give the desk the wrong understanding of the live supervisory-letter landscape. Neither error has any compliance benefit, and both add cost or distortion.

The implication for desk-level practice is that AI-assisted research on the Consumer Duty cannot be relied on for safe-harbour specification or supervisory-record summary. Both question types produced confident wrong answers in the testing, and both are among the questions a trading-desk reader is most likely to ask.

What your team should do

Stockbrokers and Trading Representatives should treat AI tools as a starting point for Duty-related desk-policy framing, not as a source of safe-harbour specification or supervisory-record summary. Any output that describes the PRIN 2A.2 foreseeable-harm test, or recites figures and dates from FCA feedback statements, requires direct verification against the FCA Handbook or the published feedback statement before it can be built into a desk policy or compliance briefing.

For practical safeguards on desk policy: (a) When an AI tool describes the conditions for the foreseeable-harm safe harbour, pull the PRIN 2A.2 paragraph from the FCA Handbook before relying on the wording; the rule is short and the conditions are explicit. (b) When an AI tool gives dates for an FCA Dear CEO letter withdrawal or supervisory action, confirm the dates against FS25/2 or the relevant feedback statement before relying on the timeline in a compliance briefing.
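For desks that track AI-assisted research in an internal tool, the two safeguards could be expressed as a simple gate. The following is a minimal sketch only, not a prescribed control: every name in it (ClaimType, Verification, AIClaim, may_enter_desk_policy) is hypothetical, and the choice of which claim types count as high risk is an assumption drawn from the two findings above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ClaimType(Enum):
    """Hypothetical categories for statements taken from AI output."""
    RULE_SPECIFICATION = "rule_specification"  # e.g. conditions of the PRIN 2A.2 test
    SUPERVISORY_RECORD = "supervisory_record"  # e.g. dates of a Dear CEO letter withdrawal
    STRUCTURAL_DRAFT = "structural_draft"      # e.g. outline of a warning template


# Assumption: these are the two claim types where the findings concentrate.
HIGH_RISK = {ClaimType.RULE_SPECIFICATION, ClaimType.SUPERVISORY_RECORD}


@dataclass
class Verification:
    source: str           # primary source checked, e.g. "FCA Handbook PRIN 2A.2"
    reviewer: str         # person who performed the check
    matches_source: bool  # whether the AI statement agreed with the source


@dataclass
class AIClaim:
    text: str
    claim_type: ClaimType
    verification: Optional[Verification] = None


def may_enter_desk_policy(claim: AIClaim) -> bool:
    """Allow low-risk drafts; require a confirmed source check for high-risk claims."""
    if claim.claim_type not in HIGH_RISK:
        return True
    return claim.verification is not None and claim.verification.matches_source


if __name__ == "__main__":
    timeline = AIClaim(
        text="Withdrawals took place in April and August 2025",
        claim_type=ClaimType.SUPERVISORY_RECORD,
    )
    # No source check has been recorded yet, so the claim is held back.
    print(may_enter_desk_policy(timeline))

    # A reviewer checks FS25/2 and records that the claim does not match it.
    timeline.verification = Verification(
        source="FS25/2", reviewer="desk compliance", matches_source=False
    )
    print(may_enter_desk_policy(timeline))
```

In this sketch a high-risk claim stays out of a desk policy or compliance briefing both when nobody has checked it and when the check shows it disagrees with the source. Structural drafts pass through without a check, reflecting the lower risk of that use.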

Where AI tools are most safely used: framing the structure of a desk policy or customer-warning template, surfacing cross-references between Duty obligations and adjacent FCA expectations on retail conduct, and producing first-draft summaries for review against the source text. The risk concentrates in the rule-specification and supervisory-record steps.

How RLB Can Help

RegLeg's published Hallucination Research gives UK stockbrokers and trading representatives a free pre-flight check before relying on AI tools for Consumer Duty desk work. The research surfaces which questions about the foreseeable-harm safe harbour and the FCA's supervisory-letter record have generated confident but incorrect AI output.

Beyond the published research, RegLeg works with retail-facing trading desks on bespoke deep-dives mapping AI-supported desk workflows to their actual hallucination exposure, and develops CPD-aligned material covering how to interpret AI-generated regulatory summaries critically in a desk context.