
By Kratti A Agrawal, Lead, RegLeg Brief Specialist Panel

Investment Banking Operations teams: documentation and reporting gaps possible from AI reading of PFMI (Principles for Financial Market Infrastructures)

Claude spots the hidden fault lines in PFMI 2012 investment banking operational doctrine.

— RLB Specialist Panel

Frontier AI models invented and misattributed PFMI Key Considerations, regulatory-research panel finds

A frontier AI model tested by the RegLeg Brief Specialist Panel documented Annex F's regulator-to-CSP oversight scope as FMI-internal only, framing the PFMI's parallel supervisor channel out of existence.

SINGAPORE. Two frontier artificial-intelligence models generated structurally confident but textually wrong reconstructions of the CPMI-IOSCO Principles for Financial Market Infrastructures (PFMI, 2012), according to findings documented by the RegLeg Brief Specialist Panel. The PFMI is the global standard for systemically important payment systems, central counterparties, and securities settlement infrastructures. The Panel is the regulatory-research function operated by Singapore-incorporated Verdus Technologies Pte. Ltd. The verbatim regulator-issued source text is held as primary substrate (R2-REGULATION-d101a_PFMI_main_text.pdf).

The pattern in one line

For Operations teams at Investment Banking firms, the pattern is this: the model gets the governance topic substantively in range, then anchors the position to a named PFMI Key Consideration or annex that does not support it. The substantive paraphrase reads as competent regulatory analysis; the citation underneath it does not check out against the published PFMI text. The reader cannot see the failure without opening the PFMI document and reading the cited Key Consideration.

How the RLB Specialist Panel tested this

The RLB Specialist Panel prepares its questions from real, practical AI usage in the workflows of the audience concerned. The Panel binds each AI finding to verbatim regulator-issued source text held as primary substrate. For PFMI, the Panel ran two question shapes against frontier AI subjects: direct questions on what the PFMI text states about CSP oversight and on what Principle 2 Key Consideration 6 requires, and application-style questions framed in the voice of a practitioner preparing a disclosure-framework response or a board terms-of-reference draft.

Two frontier AI subjects were tested with web search enabled. The Panel does not paraphrase model output; the exact strings the models produced are matched against the regulator-issued primary substrate, and only those answers that diverge from the substrate text are recorded as confirmed hallucinations.

What the models got wrong

Finding RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011-Sonnet46: Annex F supervisor-scope inversion. Claude Sonnet 4.6, with web search enabled, was asked what oversight expectations Annex F establishes for critical service providers (CSPs), and whether regulators can direct those expectations at CSPs independently of the FMI. The model answered, verbatim, that "under the PFMI framework, authorities do not directly supervise or oversee CSPs. The responsibility for ensuring CSP compliance with Annex F expectations rests with the FMI itself; the FMI is expected to contractually require and monitor its CSPs' adherence. Annex F is framed as expectations that flow from the FMI to its CSPs."

The PFMI text contradicts that framing directly. Annex F opens: "A regulator, supervisor, or overseer of an FMI may want to establish expectations for an FMI's critical service providers in order to support the FMI's overall safety and efficiency. The expectations outlined below are specifically targeted at critical service providers." The PFMI text contemplates a parallel regulator-to-CSP oversight channel; the model's output frames that channel out of existence and recasts CSP oversight as FMI-internal only.

Why this matters for Operations teams at Investment Banking firms

Operations teams at investment banks owning the live operating model for CSP relationships under an FMI mandate are the population most exposed when AI output frames supervisor reach as ending at the FMI boundary, because the runbook will not anticipate a regulator-driven inquiry that contacts the CSP directly under Annex F.

For this audience, the work product is CSP-relationship governance procedures, incident-response and escalation runbooks, third-party performance monitoring schedules, and supervisor-engagement protocols for outsourced services. Every item on that list travels under the firm's name to a reader who can locate the cited Key Consideration or annex in the published PFMI document and read it. The failure pattern is not recoverable at the desktop because the model's output reads as a competent governance paragraph: it uses defined terms correctly (FMI, KC, CSP, Annex F), it tracks the PFMI's structural vocabulary, and it states a position the reader expects to read.

The error surfaces only when the reader opens the PFMI and locates the cited Key Consideration, at which point the Operations team's draft is exposed as having attributed to the regulator's text a position that the text does not contain.

The regulator's actual position

The verbatim regulator-issued source text held as primary substrate (R2-REGULATION-d101a_PFMI_main_text.pdf) supports the following position, which contradicts the corresponding AI output.

On Annex F supervisor scope. The published PFMI Annex F opening text reads: "A regulator, supervisor, or overseer of an FMI may want to establish expectations for an FMI's critical service providers in order to support the FMI's overall safety and efficiency. The expectations outlined below are specifically targeted at critical service providers." The provision contemplates a regulator, supervisor, or overseer establishing expectations directly at the CSP; the supervisor's reach is not limited to the FMI boundary, and the PFMI does not frame CSP oversight as a purely contractual flow-down from the FMI.

What this tells us about AI for Operations teams at Investment Banking firms

The PFMI's structural surface (numbered Principles, numbered Key Considerations, lettered annexes) is the feature that makes the document amenable to AI summarisation and is also the feature that makes a citation-level failure invisible at runtime. The models in these findings did not refuse, did not hedge, and did not flag uncertainty about the cited Key Consideration. They selected a Key Consideration number from the model's prior, paraphrased a substantively plausible governance position, and pinned the position to a number that does not support it.

For Operations teams at Investment Banking firms, the implication is that AI-assisted PFMI drafting work requires a separate verification step in which the cited Key Consideration is opened in the regulator's published text and matched against the position the draft attributes to it. The cost of skipping that step lands in the published work product, not in the AI tool.
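Where a draft quotes the PFMI directly, part of that verification step can be mechanised. The following is a minimal Python sketch under stated assumptions, not the Panel's tooling: it assumes the published text has already been extracted to plain text, and the function names `normalise` and `quote_is_verbatim` are hypothetical. It checks only whether a quoted passage occurs verbatim in the source; whether a paraphrase is supported by the cited provision still needs a human reader.

```python
import re
import unicodedata


def normalise(text: str) -> str:
    """Fold typographic variants and whitespace so that extraction noise
    (curly quotes, line wraps) does not cause false mismatches."""
    text = unicodedata.normalize("NFKC", text)
    # Map curly quotes and apostrophes to their ASCII forms.
    text = text.translate(str.maketrans({
        "\u2018": "'", "\u2019": "'",
        "\u201c": '"', "\u201d": '"',
    }))
    # Collapse any run of whitespace (including newlines) to one space.
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_is_verbatim(quote: str, source_text: str) -> bool:
    """Return True if the quoted passage occurs in the source text
    after normalisation."""
    return normalise(quote) in normalise(source_text)


# Assumption: a short stand-in for the full extracted PFMI text, using the
# Annex F opening sentence as quoted in this briefing.
source = (
    "A regulator, supervisor, or overseer of an FMI may want to establish\n"
    "expectations for an FMI\u2019s critical service providers in order to\n"
    "support the FMI\u2019s overall safety and efficiency."
)

supported = quote_is_verbatim(
    "may want to establish expectations for an FMI's critical service providers",
    source,
)
unsupported = quote_is_verbatim(
    "authorities do not directly supervise or oversee CSPs",
    source,
)
print(supported, unsupported)
```

A check of this kind narrows the manual work to the claims that fail the match, which still have to be read against the cited Key Consideration or annex.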

What the RLB Specialist Panel is doing about it

The RegLeg Brief Specialist Panel records and documents each confirmed hallucination with an immutable RLB Citation ID, a verbatim copy of the AI subject's exact output, the verbatim regulator-issued source text, and a named failure-class label. The records are open-access and ungated, and the Specialist Panel operates with an aggregate of over 60 years of public-policy and industry experience in financial market infrastructure regulation. AI labs whose subjects appear in any finding have an unconditional right of reply; the Specialist Panel will document any factual correction or contextual response alongside the original finding, with no editorial gatekeeping.
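The four elements of a record named above can be pictured as a simple data structure. This is an illustrative Python sketch only: the class name `HallucinationRecord` and its field names are hypothetical, and freezing the whole record is a design assumption made here (the briefing states only that the Citation ID is immutable). The model output shown is an excerpt of the string quoted in this briefing.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class HallucinationRecord:
    """Hypothetical container for one confirmed finding."""
    citation_id: str      # RLB Citation ID
    model_output: str     # verbatim copy of the AI subject's output
    regulator_text: str   # verbatim regulator-issued source text
    failure_class: str    # named failure-class label


record = HallucinationRecord(
    citation_id="RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011-Sonnet46",
    # Excerpt only; the full output is quoted in the finding itself.
    model_output=(
        "under the PFMI framework, authorities do not directly "
        "supervise or oversee CSPs."
    ),
    regulator_text=(
        "A regulator, supervisor, or overseer of an FMI may want to "
        "establish expectations for an FMI's critical service providers "
        "in order to support the FMI's overall safety and efficiency."
    ),
    failure_class="Annex F supervisor-scope inversion",
)

# A frozen dataclass rejects attribute assignment after construction.
try:
    record.failure_class = "edited"
    mutated = True
except FrozenInstanceError:
    mutated = False
print(mutated)
```

Keeping the model output and the regulator text side by side in one record is what lets a third party reproduce the comparison without access to the original session.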

The Panel runs the PFMI test bench on a continuous basis: as model versions and PFMI implementation guidance evolve, the same Specialist Panel questions are re-run against the current subjects, and any change in the failure shape is documented to the same citation ID family. The Panel does not redact the exact strings the AI subjects produced, and does not paraphrase the regulator-issued source text; the value of the audit record sits in the verbatim pairing, and that record stays usable for AI engineering teams, supervisors, and practitioners who need to reproduce the finding against their own deployments.

For the PFMI specifically, the Specialist Panel maintains the source binding to the published primary text and expands the bench as additional Principles and Annexes are tested.

What Operations teams at Investment Banking firms should do

The verification cost on the front end is measured in minutes per Key Consideration: open the regulator's PFMI document to the cited number, read the verbatim text, and confirm the AI output's substantive claim is supported. The recovery cost on the back end, if the failure reaches the published work product, is measured in supervisor remediation cycles, counterparty disputes, and reputational exposure for the firm and the practitioner who signed off the draft.

The RegLeg Brief Specialist Panel records each finding so that the failure shape is recognisable on sight, and so the verification step can be targeted at the specific failure class the model exhibits on PFMI material rather than at every line of every output.


Right of Reply

These findings and the associated work are published in the interest of developing a safer AI ecosystem. Any party reading this or any finding on reglegbrief.com may contact us and has an unconditional right of reply; the Specialist Panel will publish any factual correction or contextual response alongside the original finding, with no editorial gatekeeping. Researchers, regulators, and compliance teams with questions on methodology or specific findings can reach the Specialist Panel via the same channel.

Source & Methodology Standards

RegLeg Brief is operated by Verdus Technologies Pte. Ltd. (UEN 201616982R), incorporated in Singapore. The RLB Specialist Panel, with an aggregate of over 60 years of public-policy and industry experience, documents only confirmed hallucination findings, under a methodology that requires a verbatim regulator excerpt for every documented claim. All findings, citation IDs, model outputs, regulator excerpts, and methodology notes are open-access.


Primary source verified: CPMI-IOSCO PFMI Report d101, Principles for Financial Market Infrastructures (April 2012) · Substrate documents: R2-REGULATION-d101a_PFMI_main_text.pdf · CPMI portal: bis.org/cpmi

Citation IDs referenced:
