Operations teams at Investment Banking firms working on the CPMI-IOSCO Principles for Financial Market Infrastructures (PFMI, 2012) are increasingly relying on AI to draft CSP contract governance and escalation runbooks under an FMI's operating model, prepare incident-response playbooks for CSP performance failures, structure third-party performance monitoring against the FMI's mandate, and validate supervisor engagement protocols against the regulator-issued Annex F text.
The PFMI framework is the global standard for systemically important payment systems, central counterparties, and securities settlement infrastructures, and the document's structure makes it particularly amenable to AI summarisation: numbered Principles, numbered Key Considerations, and lettered annexes that the model can address by number.
That surface structure is also what makes the failure mode the RegLeg Brief Specialist Panel records here invisible at runtime. The document is regularly cited by Key Consideration number in board papers, disclosure-framework returns, and counterparty representations, so a misattributed citation does not register as a substantive error in the draft. It registers as a competent regulatory paragraph that the reader will not check against the regulator's primary text unless something else prompts the verification.
Two frontier AI models tested by the RegLeg Brief Specialist Panel produced confidently wrong reconstructions of the PFMI's governance and oversight architecture under Principle 2 (governance) and Annex F (oversight expectations for critical service providers). The Panel records one finding in the class the team labels "Supervisor-Scope Inversion", in which the models stated a substantively plausible governance position and pinned it to a named Key Consideration that the published PFMI text does not support. The finding identifier is RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011-Sonnet46.
For Operations teams at Investment Banking firms, the failure shape matters because the work product is CSP-relationship governance procedures, incident-response and escalation runbooks, third-party performance monitoring schedules, and supervisor-engagement protocols for outsourced services, all of which travel under the firm's name to a board, supervisor, counterparty, or public reviewer who can locate the cited Key Consideration and check it against the regulator's primary text.
Operations teams at investment banks owning the live operating model for CSP relationships under an FMI mandate are the population most exposed when AI output frames supervisor reach as ending at the FMI boundary, because the runbook will not anticipate a regulator-driven inquiry that contacts the CSP directly under Annex F.
The Panel documents the finding under the identifier RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011-Sonnet46. The AI subject named in this finding is Claude Sonnet 4.6, run with web search enabled to mirror the workflow most practitioners use when they ask an assistant a Principle 2 or Annex F question. The verbatim regulator text is held as primary substrate (R2-REGULATION-d101a_PFMI_main_text.pdf). Each finding card sets out the exact strings the model produced, the verbatim regulator excerpt the model's output contradicts, and the failure-class label the RegLeg Brief Specialist Panel assigns.
The records are open-access; AI labs named in any finding have an unconditional right of reply, and the Specialist Panel will document any factual correction or contextual response alongside the original finding.
This is the consolidated view of findings. Click a Citation ID or 'see details →' on any item for the full details of that finding.
For Operations teams at Investment Banking firms managing live CSP relationships under an FMI's operating model, the inverted scope would frame CSP performance failures as a purely contractual escalation path without contemplating a regulator-driven inquiry that may reach the CSP directly. Annex F's opening provision creates a parallel channel that operations leads need to anticipate in CSP contract governance and incident-response runbooks.
Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.
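For teams that want to run the same kind of check on their own drafts, the comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Panel's tooling: the `FindingCard` record, its field names, and the `excerpt_in_primary` helper are all hypothetical, and the check only confirms that a quoted excerpt appears verbatim in a locally held copy of the primary text once whitespace and letter case are normalised. It does not judge whether a citation is substantively correct.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FindingCard:
    """Hypothetical record; field names are illustrative, not the Panel's schema."""
    citation_id: str        # immutable Citation ID of the finding
    failure_class: str      # failure-class label assigned to the finding
    model_claim: str        # exact string the AI subject produced
    cited_reference: str    # Key Consideration or annex the model named
    regulator_excerpt: str  # verbatim excerpt from the primary text


def normalise(text: str) -> str:
    """Collapse whitespace runs and lowercase, so PDF line breaks do not cause false mismatches."""
    return re.sub(r"\s+", " ", text).strip().lower()


def excerpt_in_primary(card: FindingCard, primary_text: str) -> bool:
    """Return True if the card's regulator excerpt occurs verbatim in the primary text."""
    return normalise(card.regulator_excerpt) in normalise(primary_text)
```

In use, a reviewer would load the extracted text of the primary document, build one record per cited passage, and treat any `False` result as a prompt to open the regulator's text and read the cited provision directly before the draft travels further.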