Sector × Dept INT BIS-CPMI
Payment Institutions Legal teams · Principles for Financial Market Infrastructures (PFMI)

By Kratti A Agrawal, Lead, RegLeg Brief Specialist Panel

Payment Institutions Legal teams: documentation and reporting gaps possible from AI reading of PFMI (Principles for Financial Market Infrastructures)

Opus decodes the hallucination grammar in PFMI 2012 payment institution legal obligations.

— RLB Specialist Panel

Frontier AI models invented and misattributed PFMI Key Considerations, regulatory-research panel finds

A frontier AI model tested by the RegLeg Brief Specialist Panel documented Annex F's regulator-to-CSP oversight scope as FMI-internal only, framing the PFMI's parallel supervisor channel out of existence.

SINGAPORE. Two frontier artificial-intelligence models generated structurally confident but textually wrong reconstructions of the CPMI-IOSCO Principles for Financial Market Infrastructures (PFMI, 2012), the global standard for systemically important payment systems, central securities depositories, securities settlement systems, central counterparties, and trade repositories. The findings were documented by the RegLeg Brief Specialist Panel, the regulatory-research function operated by Singapore-incorporated Verdus Technologies Pte. Ltd. The verbatim regulator-issued source text is held as primary substrate (R2-REGULATION-d101a_PFMI_main_text.pdf).

The pattern in one line

For Legal teams at payment institutions, the pattern is this: the model gets the governance topic substantively in range, then anchors the position to a named PFMI Key Consideration or annex that does not support it. The substantive paraphrase reads as competent regulatory analysis; the citation underneath it does not check out against the published PFMI text. The reader cannot see the failure without opening the PFMI document and reading the cited Key Consideration or annex.

How the RLB Specialist Panel tested this

The RLB Specialist Panel prepares its questions from the workflows in which the respective audience actually uses AI. The Panel binds each AI finding to verbatim regulator-issued source text held as primary substrate. For PFMI, the Panel ran two question shapes against frontier AI subjects: direct questions on what the PFMI text states about CSP oversight and on what Principle 2 Key Consideration 6 requires, and application-style questions framed in the voice of a practitioner preparing a disclosure-framework response or a board terms-of-reference draft.

Two frontier AI subjects were tested with web search enabled. The Panel does not paraphrase model output; the exact strings the models produced are matched against the regulator-issued primary substrate, and only those answers that diverge from the substrate text are recorded as confirmed hallucinations.

What the models got wrong

Finding RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011-Sonnet46: Annex F supervisor-scope inversion. Claude Sonnet 4.6, with web search enabled, was asked what oversight expectations Annex F establishes for critical service providers, and whether regulators can direct those expectations at CSPs independently of the FMI. The model answered, verbatim, that "under the PFMI framework, authorities do not directly supervise or oversee CSPs. The responsibility for ensuring CSP compliance with Annex F expectations rests with the FMI itself; the FMI is expected to contractually require and monitor its CSPs' adherence. Annex F is framed as expectations that flow from the FMI to its CSPs."

The PFMI text contradicts that framing directly. Annex F opens: "A regulator, supervisor, or overseer of an FMI may want to establish expectations for an FMI's critical service providers in order to support the FMI's overall safety and efficiency. The expectations outlined below are specifically targeted at critical service providers." The PFMI text contemplates a parallel regulator-to-CSP oversight channel; the model's output frames that channel out of existence and recasts CSP oversight as FMI-internal only.

Legal teams at payment institutions signing off on opinions or contract structures for CSP mandates are the population most exposed when AI output documents the supervisory relationship as purely contractual and FMI-internal, because the opinion carries the firm's name to a counterparty or supervisor who can locate Annex F and see the parallel regulator-to-CSP oversight channel the text contemplates.

For this audience, the work product is legal opinions on FMI third-party oversight, CSP-mandate contract structures, counterparty representations on PFMI Annex F compliance, and supervisor-engagement scope memoranda. Every item on that list travels under the firm's name to a reader who can locate the cited Key Consideration or annex in the published PFMI document and read it.

The failure cannot be caught on a desk read of the output alone, because the model's output reads as a competent governance paragraph: it uses defined terms correctly (FMI, KC, CSP, Annex F), it tracks the PFMI's structural vocabulary, and it states a position the reader expects to read. The error surfaces only when the reader opens the PFMI and locates the cited Key Consideration or annex, at which point the Legal team's draft is exposed as having attributed a position to the regulator's text that the text does not contain.

The regulator's actual position

The verbatim regulator-issued source text held as primary substrate (R2-REGULATION-d101a_PFMI_main_text.pdf) supports the following position, which contradicts the corresponding AI output.

On Annex F supervisor scope. The published PFMI Annex F opening text reads: "A regulator, supervisor, or overseer of an FMI may want to establish expectations for an FMI's critical service providers in order to support the FMI's overall safety and efficiency. The expectations outlined below are specifically targeted at critical service providers." The provision contemplates a regulator, supervisor, or overseer establishing expectations directly at the CSP; the supervisor's reach is not limited to the FMI boundary, and the PFMI does not frame CSP oversight as a purely contractual flow-down from the FMI.

The PFMI's structural surface (numbered Principles, numbered Key Considerations, lettered annexes) is the feature that makes the document amenable to AI summarisation and is also the feature that makes a citation-level failure invisible at runtime. The models in these findings did not refuse, did not hedge, and did not flag uncertainty about the cited Key Consideration. They selected a Key Consideration number from the model's prior, paraphrased a substantively plausible governance position, and pinned the position to a number that does not support it.

For Legal teams at payment institutions, the implication is that AI-assisted PFMI drafting work requires a separate verification step in which the cited Key Consideration or annex is opened in the regulator's published text and matched against the position the draft attributes to it. The cost of skipping that step lands in the published work product, not in the AI tool.
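One narrow part of that verification step can be mechanised: checking that any passage a draft presents as a quotation from the PFMI actually appears in the source text. The sketch below is illustrative only and is not the Panel's tooling. The function names `normalize` and `quote_in_substrate` are hypothetical, and the choice to fold typographic quotes and collapse whitespace before comparing is an assumption. The source string is the Annex F opening as quoted in this briefing.

```python
import re
import unicodedata

# Hypothetical helper: fold typographic quotes and collapse whitespace
# so that PDF line breaks and curly quotes do not cause false mismatches.
_QUOTE_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def normalize(text: str) -> str:
    text = unicodedata.normalize("NFKC", text).translate(_QUOTE_MAP)
    return re.sub(r"\s+", " ", text).strip()


def quote_in_substrate(quote: str, substrate: str) -> bool:
    """Return True if `quote` appears verbatim (after normalisation) in `substrate`."""
    return normalize(quote) in normalize(substrate)


# Annex F opening text, as quoted in this briefing.
ANNEX_F_OPENING = (
    "A regulator, supervisor, or overseer of an FMI may want to establish "
    "expectations for an FMI's critical service providers in order to support "
    "the FMI's overall safety and efficiency. The expectations outlined below "
    "are specifically targeted at critical service providers."
)

# A sentence from the model output quoted in the finding.
model_claim = "authorities do not directly supervise or oversee CSPs"

# A genuine quotation (typed with a curly apostrophe) is found in the source.
supported = quote_in_substrate(
    "may want to establish expectations for an FMI\u2019s critical service providers",
    ANNEX_F_OPENING,
)
# The model's sentence is not text from the source.
unsupported = quote_in_substrate(model_claim, ANNEX_F_OPENING)

print(supported, unsupported)
```

A containment check of this kind only confirms that quoted text exists in the source. It cannot judge whether a paraphrase is faithful, so the failure described in this briefing still requires a person to read the cited provision.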

What the RLB Specialist Panel is doing about it

The RegLeg Brief Specialist Panel records and documents each confirmed hallucination with an immutable RLB Citation ID, a verbatim copy of the AI subject's exact output, the verbatim regulator-issued source text, and a named failure-class label. The records are open-access and ungated, and the Specialist Panel operates with an aggregate of over 60 years of public-policy and industry experience in financial market infrastructure regulation. AI labs whose subjects appear in any finding have an unconditional right of reply; the Specialist Panel will document any factual correction or contextual response alongside the original finding, with no editorial gatekeeping.
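The four elements of a finding record listed above can be pictured as a small immutable structure. This is a sketch under stated assumptions, not the Panel's schema: the class name `Finding` and its field names are hypothetical, the model output and source text are shortened to the excerpts quoted in this briefing, and the finding's title is used as the failure-class label on the assumption that the two coincide.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Finding:
    citation_id: str    # RLB Citation ID
    model_output: str   # verbatim AI output (an excerpt in this example)
    source_text: str    # verbatim regulator-issued text (an excerpt in this example)
    failure_class: str  # named failure-class label (assumed to match the finding title)


finding = Finding(
    citation_id="RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011-Sonnet46",
    model_output=(
        "under the PFMI framework, authorities do not directly supervise "
        "or oversee CSPs."
    ),
    source_text=(
        "A regulator, supervisor, or overseer of an FMI may want to establish "
        "expectations for an FMI's critical service providers in order to "
        "support the FMI's overall safety and efficiency."
    ),
    failure_class="Annex F supervisor-scope inversion",
)

# Any attempt to edit a recorded field is rejected.
try:
    finding.failure_class = "edited"
    mutated = True
except FrozenInstanceError:
    mutated = False

print(finding.citation_id, mutated)
```

A frozen dataclass only prevents reassignment inside a running program. Immutability of a published record would depend on how the records are stored and versioned, which this briefing does not describe.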

The Panel runs the PFMI test bench on a continuous basis: as model versions and PFMI implementation guidance evolve, the same Specialist Panel questions are re-run against the current subjects, and any change in the failure shape is documented to the same citation ID family. The Panel does not redact the exact strings the AI subjects produced, and does not paraphrase the regulator-issued source text; the value of the audit record sits in the verbatim pairing, and that record stays usable for AI engineering teams, supervisors, and practitioners who need to reproduce the finding against their own deployments.

For the PFMI specifically, the Specialist Panel maintains the source binding to the published primary text and expands the bench as additional Principles and Annexes are tested.

The verification cost on the front end is measured in minutes per Key Consideration: open the regulator's PFMI document to the cited number, read the verbatim text, and confirm the AI output's substantive claim is supported. The recovery cost on the back end, if the failure reaches the published work product, is measured in supervisor remediation cycles, counterparty disputes, and reputational exposure for the firm and the practitioner who signed off the draft.

The RegLeg Brief Specialist Panel records each finding so that the failure shape is recognisable on sight, and so the verification step can be targeted at the specific failure class the model exhibits on PFMI material rather than at every line of every output.


Right of Reply

These findings and the associated work have been published in the interest of a safer AI ecosystem. Any party reading this or any other finding on reglegbrief.com may contact us and has an unconditional right of reply; the Specialist Panel will publish any factual correction or contextual response alongside the original finding, with no editorial gatekeeping. Researchers, regulators, and compliance teams with questions on methodology or specific findings can reach the Specialist Panel via the same channel.

Source & Methodology Standards

RegLeg Brief is operated by Verdus Technologies Pte. Ltd. (UEN 201616982R), incorporated in Singapore. The RLB Specialist Panel, with an aggregate of over 60 years of public-policy and industry experience, documents only confirmed hallucination findings, under a methodology that requires a verbatim regulator excerpt for every documented claim. All findings, citation IDs, model outputs, regulator excerpts, and methodology notes are open-access.


Primary source verified: CPMI-IOSCO PFMI Report d101, Principles for Financial Market Infrastructures (April 2012) · Substrate documents: R2-REGULATION-d101a_PFMI_main_text.pdf · CPMI portal: bis.org/cpmi

Citation IDs referenced:
