Opus unravels the geometry of AI mistakes in Consumer Duty for premium-listed issuer legal teams.
— RLB Specialist Panel
Misstated Statutory Architecture, Reversed the PRIN 2A Group-Insurance Exclusion: Consumer Duty (PS22/9 + PRIN 2A) under audit.
Two frontier AI models tested by the RLB Specialist Panel produced two substantive failures on the Consumer Duty, with material implications for the work product of mainboard-issuer legal teams.
Frontier AI models, asked the kind of questions mainboard-issuer legal teams put to them on the Consumer Duty in real workflows, produce confident answers that drift from the regulator's actual position on Principle 12, PRIN 2A, and the FCA's Feedback Statement record. The failure classes observed are Misstated Statutory Architecture and Reversed the PRIN 2A Group-Insurance Exclusion.
Questions were prepared by the RLB Specialist Panel from real, practical AI usage in the audience's own workflows. Each question is paired with verbatim regulator-issued source text held as primary substrate, against which the AI subject's answer is graded. Two frontier AI models were the subjects under test on this regulation. The panel binds each finding to the substrate excerpt it tests against; the binding is what makes each finding referenceable and audit-traceable.
Citation: RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q002-Sonnet46. Model under test: Claude Sonnet 4.6. Failure mode: misstated rule.
Question put to the model: Whether the Consumer Duty was created through primary legislation or FCA rulemaking, the legal basis for Principle 12 and PRIN 2A, and whether FSMA 2023 had any role in creating it.
What the model answered: The model stated the Consumer Duty was introduced by FCA rules with the legal basis sitting under FSMA 2000, omitting any role for FSMA 2023.
Regulator-issued position (verbatim): "FSMA 2023 did not create the Consumer Duty."
Reading: The regulator-issued text confirms FSMA 2023 did not create the Consumer Duty. The model's silence on FSMA 2023 is technically the right outcome but reached without engaging the question, leaving the user without confidence that the model actually considered the post-Brexit statute.
Citation: RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47. Model under test: Claude Opus 4.7. Failure mode: misstated rule.
Question put to the model: Whether the Consumer Duty applies to reinsurance, group insurance policy distribution, and large-risk commercial contracts.
What the model answered: The model stated the Duty applies via the distribution chain when group policy beneficiaries are retail customers, citing a putative further consultation (CP23/something) confirming in-scope status.
Regulator-issued position (verbatim): "Consumer Duty does not apply to reinsurance, contracts of large risk sold to commercial customers where risk is located outside the UK, nor to activities connected to the distribution of group insurance policies or the extension of these policies to new members."
Reading: The PRIN 2A scope exclusion expressly carves out group insurance distribution. The model's reversal would push firms into in-scope treatment of activities the regulator has placed out of scope, and the fabricated consultation reference compounds the error.
For mainboard / premium-listed issuer legal teams, the operational consequence is direct. Prospectus drafting, listing-rule compliance memos, board legal opinions, and director attestations on Consumer Duty applicability all rest on accurate PRIN 2A scope framing and statutory-architecture citation. A defect imported from AI work product surfaces on listing-rule review or audit-committee challenge, and the in-house function carries the professional exposure.
The failures recorded here are not stylistic. Each one would, if relied on, shift the firm's documented position on a specific Consumer Duty obligation: scope of application, foreseeable-harm safe harbour, fair-value methodology, or the current status of pre-Consumer Duty supervisory expectations. The work product of mainboard-issuer legal teams sits between the firm and the regulator, and it has to track the rule as written.
On the question of FSMA 2023's role in the Consumer Duty: "FSMA 2023 did not create the Consumer Duty." (source: regulator-issued primary substrate held by the RLB Specialist Panel; citation RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q002-Sonnet46).
On the question of Consumer Duty scope on reinsurance, group insurance, and large-risk commercial contracts: "Consumer Duty does not apply to reinsurance, contracts of large risk sold to commercial customers where risk is located outside the UK, nor to activities connected to the distribution of group insurance policies or the extension of these policies to new members." (source: regulator-issued primary substrate held by the RLB Specialist Panel; citation RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47).
Frontier AI models are useful drafting partners for mainboard-issuer legal teams, but they are not a substitute for the rule text. The failure patterns recorded on the Consumer Duty fall into three classes. First, scope drift, in which the model misstates what the rule covers, illustrated here by the reversed group-insurance exclusion under PRIN 2A and the silent omission of FSMA 2023 from the statutory architecture answer.
Second, methodology drift, in which the model elevates guidance (FG22/5) to rule status (PRIN 2A) or imports a stricter expectation than the regulator sets, illustrated by the non-monetary quantification framing the FCA expressly disavowed. Third, evidence-avoidance, in which the model refuses to commit on a question that the regulator has answered in plain text in a documented Feedback Statement, illustrated here by the FS25/2 Dear CEO letter retirement count.
For mainboard-issuer legal teams, the practical reading is: AI output on the Consumer Duty needs to be checked against verbatim substrate (PRIN 2A, PS22/9, FG22/5, FS25/2) before it lands in a work product the firm or the regulator will rely on. The model's confidence is not a reliable signal of accuracy on this regulation, because the failures recorded are confident-wrong, not hesitant-wrong.
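One way to make the verbatim check mechanical is sketched below. It is a minimal illustration, not the Panel's tooling: the function names `normalize` and `quote_is_verbatim` are hypothetical, and the `substrate` string holds only the scope excerpt quoted in this brief, standing in for the full PRIN 2A / PS22/9 text a team would load in practice.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold Unicode variants, straighten curly quotes, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip()


def quote_is_verbatim(quote: str, substrate: str) -> bool:
    """Return True only if the quote appears word for word in the substrate."""
    needle = normalize(quote)
    if not needle:
        # An empty quote proves nothing, so treat it as unverified.
        return False
    return needle in normalize(substrate)


# Illustrative substrate: the scope excerpt quoted in this brief (assumption:
# a real check would run against the full regulator-issued document).
substrate = (
    "Consumer Duty does not apply to reinsurance, contracts of large risk "
    "sold to commercial customers where risk is located outside the UK, nor "
    "to activities connected to the distribution of group insurance policies "
    "or the extension of these policies to new members."
)

supported = quote_is_verbatim(
    "nor to activities connected to the distribution of group insurance policies",
    substrate,
)
unsupported = quote_is_verbatim(
    "the Duty applies to the distribution of group insurance policies",
    substrate,
)
```

A substring match of this kind only confirms that a quoted passage exists in the source. It does not confirm that the surrounding AI-drafted argument reads the passage correctly, so it supplements rather than replaces review against the rule text.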
The Consumer Duty is a regime the FCA grades in writing through Feedback Statements, supervisory correspondence, and thematic-review outputs, and the rule text plus the FCA's documented Feedback record together are what an audit-traceable AI use of the regulation has to sit on.
The RLB Specialist Panel runs structured audits of frontier AI models against high-stakes regulator-issued texts. Each finding on the Consumer Duty is bound to the regulator-issued source it tests against, recorded with the AI subject's answer in full, and held with the verbatim regulator-issued passage that grades the answer. The Panel offers AI labs a partnership channel: regulator-substrate-bound finding sets, audience-tagged per profession and per sector-department, that feed into model evaluation pipelines and post-training reinforcement signals.
Firms with a stake in the work product (legal, compliance, risk, product, board secretariat) get a referenceable finding library on the regulations they live with. Every finding on this regulation is held with the question, the AI subject's full answer, the regulator-issued passage that grades the answer, and the model identity, so the record can be audited end-to-end and the binding can be re-verified at any time.
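The record described above can be pictured as a small data structure. The sketch below is an assumption-laden illustration rather than the Panel's actual schema: the `Finding` class, its field names, and the `CITATION_ID` pattern are hypothetical, with the pattern inferred only from the two citation IDs that appear in this brief.

```python
import re
from dataclasses import astuple, dataclass

# Assumed ID shape, inferred from the two IDs in this brief:
# RLB-H-<jurisdiction/regulator/instrument>-Q<3 digits>-<model tag>
CITATION_ID = re.compile(
    r"^RLB-H-(?P<body>.+)-Q(?P<question>\d{3})-(?P<model>[A-Za-z0-9]+)$"
)


@dataclass(frozen=True)
class Finding:
    """One finding, bound to the regulator text that grades it."""

    citation_id: str
    model: str
    failure_mode: str
    question: str
    model_answer: str
    regulator_text: str

    def question_number(self) -> int:
        """Extract the question number embedded in the citation ID."""
        match = CITATION_ID.match(self.citation_id)
        if match is None:
            raise ValueError(f"unrecognised citation ID: {self.citation_id}")
        return int(match.group("question"))

    def is_complete(self) -> bool:
        """A record is auditable end to end only if no field is blank."""
        return all(value.strip() for value in astuple(self))


finding = Finding(
    citation_id="RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47",
    model="Claude Opus 4.7",
    failure_mode="misstated rule",
    question=(
        "Whether the Consumer Duty applies to reinsurance, group insurance "
        "policy distribution, and large-risk commercial contracts."
    ),
    model_answer=(
        "The Duty applies via the distribution chain when group policy "
        "beneficiaries are retail customers."
    ),
    regulator_text=(
        "Consumer Duty does not apply to reinsurance, contracts of large risk "
        "sold to commercial customers where risk is located outside the UK, "
        "nor to activities connected to the distribution of group insurance "
        "policies or the extension of these policies to new members."
    ),
)
```

Keeping the record immutable (`frozen=True`) reflects the idea that a finding, once bound to its substrate excerpt, should not be edited in place; a correction would be recorded as a new entry alongside it.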
Mainboard / premium-listed issuer in-house legal teams should apply the following discipline to every AI-drafted Consumer Duty work product:
These findings and associated work have been published in the interest of a safer AI ecosystem. Any party reading this or any finding on reglegbrief.com may contact us and has an unconditional right of reply; the Specialist Panel will publish any factual correction or contextual response alongside the original finding, with no editorial gatekeeping. Researchers, regulators, and compliance teams with questions on methodology or specific findings can reach the Specialist Panel via the same channel.
RegLeg Brief is operated by Verdus Technologies Pte. Ltd. (UEN 201616982R), incorporated in Singapore. The RLB Specialist Panel, with an aggregate of over 60 years of public-policy and industry experience, documents only confirmed hallucination findings, under a methodology that requires a verbatim regulator excerpt for every documented claim. All findings, citation IDs, model outputs, regulator excerpts, and methodology notes are open-access.
Primary source verified: FCA PS22/9 + PRIN 2A + FG22/5 · Substrate documents: R2-REGULATION-PS22_9_full_policy_statement.pdf, R3-GUIDELINE-Q17_consumer_duty_focus_areas.pdf · FCA portal: fca.org.uk
Citation IDs referenced:
RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q002-Sonnet46
RLB-H-GB-FCA-CONSUMER-DUTY-PS22-9-Q018-Opus47