Practitioners — Company Secretaries · Last updated 11 Jun 2026 · methodology v2.3 · Hallucination Register

AI Hallucination on Principles for Financial Market Infrastructures (PFMI) for Company Secretaries in international jurisdictions

Company Secretaries: AI summaries of PFMI may understate professional obligations

Company Secretaries working on the CPMI-IOSCO Principles for Financial Market Infrastructures (PFMI, 2012) are increasingly relying on AI to draft board and risk-committee terms of reference, prepare papers for the board's oversight of critical service providers, validate committee mandates against the PFMI's published Key Considerations, and assemble governance disclosures for the FMI's annual disclosure-framework return. The PFMI framework is the global standard for systemically important payment systems, central counterparties, and securities settlement infrastructures, and the document's structure makes it particularly amenable to AI summarisation: numbered Principles, numbered Key Considerations, and lettered annexes that the model can address by number.

That surface structure is also what makes the failure mode recorded here by the RegLeg Brief Specialist Panel invisible at runtime. The document is regularly cited by Key Consideration number in board papers, disclosure-framework returns, and counterparty representations, so a misattributed citation does not register as a substantive error in the draft. It reads as a competent regulatory paragraph that the reader will not check against the regulator's primary text unless something else prompts the verification.

Two frontier AI models tested by the RegLeg Brief Specialist Panel produced confidently wrong reconstructions of the PFMI's governance and oversight architecture under Principle 2 (governance) and Annex F (oversight expectations for critical service providers). The Panel records two findings in the class the team labels "Source-Credit Fabrication and Supervisor-Scope Inversion", in which the models stated a substantively plausible governance position and pinned it to a named Key Consideration that the published PFMI text does not support. The finding identifiers are RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011-Sonnet46, RLB-H-INT-BIS-CPMI-IOSCO-PFMI-2012-Q022-Opus47.

For Company Secretaries, the failure shape matters because the work product is board charters, risk-committee terms of reference, board information papers on third-party oversight, and PFMI disclosure-framework responses, all of which travel under the firm's name to a board, supervisor, counterparty, or public reviewer who can locate the cited Key Consideration and check it against the regulator's primary text. Company Secretaries who route AI-drafted board and committee documentation into the FMI's governance pack are the population most exposed when the model misnumbers a Key Consideration or fabricates a committee-architecture mandate that the PFMI text does not contain.

The AI subjects under test were Claude Opus 4.7 and Claude Sonnet 4.6, each running with web search enabled, mirroring the workflow most practitioners run when they ask an assistant a Principle 2 or Annex F question. The verbatim regulator text is held as primary substrate (R2-REGULATION-d101a_PFMI_main_text.pdf). Each finding card sets out the exact strings the model produced, the verbatim regulator excerpt the model's output contradicts, and the failure-class label the RegLeg Brief Specialist Panel assigns.

The records are open-access; AI labs named in any finding have an unconditional right of reply, and the Specialist Panel will document any factual correction or contextual response alongside the original finding.

Executive Summary

The Principles for Financial Market Infrastructures (PFMI), published in April 2012 by the Bank for International Settlements' Committee on Payments and Market Infrastructures (CPMI) and IOSCO, is the international standard governing the design and operation of systemically critical financial market infrastructures, covering central counterparties, central securities depositories, payment systems, and trade repositories. The 24 Principles, their underlying Key Considerations, and the companion assessment methodology together form the operative compliance framework that national supervisors apply to FMIs across major jurisdictions.

Company Secretaries supporting boards and committees at FMIs, FMI participants, and firms operating across PFMI-implementing jurisdictions rely on the PFMI's Principle 2 governance text and Annex F's oversight provisions when drafting board mandates, committee terms of reference, and policy papers that cite the framework as the regulatory basis for governance design choices. RegLeg tested two frontier AI tools (Claude Opus 4.7 and Claude Sonnet 4.6, each with web search enabled) against the PFMI's Principle 2 governance text and Annex F oversight provisions, and documented two hallucinations relevant to this audience.

The dominant failure pattern is not blanket fabrication of regulatory language but a structurally confident substitution of training-data priors for the regulator's actual KC-level wording, producing outputs that read like authoritative regulatory analysis while resting on citations that do not survive a check against the PFMI text itself. For Company Secretaries in international jurisdictions, the practical consequence is that any work product that inherits these AI outputs without independent verification against the regulator's published text carries embedded misstatements of what the PFMI requires.

How AI gets this regulation wrong

The table below catalogues how AI tools err on this regulation. The pattern observed across both findings is structural: the AI tools substituted training-data priors (generic corporate-governance language, common committee-design conventions, conventional FMI-internal framings of oversight) for the PFMI's actual KC-level text. In each case the AI output carried surface features of authoritative regulatory analysis (specific KC numbers, quoted phrasing, internal cross-references) while the underlying citation did not check out against the PFMI's published text.

AI's Failure Mode | Count | Affected findings
Inference Drift | 2 | Finding #1 · Finding #2

What that means for your practice

For Company Secretaries supporting boards and risk committees at FMIs and FMI participants, the findings in this cell translate into a risk of misstated governance citations in board papers, committee mandates, and supervisor-facing self-assessments. The table below shows how that risk distributes across the individual findings.

Risk Impact | Count | Affected findings
Liability / PI exposure | 2 | Finding #1 · Finding #2

When this affects Company Secretaries

PFMI-related work reaches Company Secretaries most often when the firm participates in, operates, or is supervised against a financial market infrastructure subject to the PFMI standards. The 24 Principles, their Key Considerations, and the companion assessment methodology together drive the supervisor's PFMI assessment cycle and the FMI's disclosure-framework responses, both of which generate documentation work that crosses the Company Secretary's desk regularly.

The specific findings in this cell show where AI tools fail on the PFMI. On Finding #1 (Annex F critical service provider oversight, supervisory scope inverted), Claude Sonnet 4.6 (web search on) inverted Annex F's regulator-to-CSP oversight provision, asserting that authorities cannot directly establish CSP expectations when the Annex's own text grants them that authority. On Finding #2 (PFMI Principle 2 KC 6, non-executive risk-committee chair mandate fabricated), Claude Opus 4.7 (web search on) fabricated a non-executive risk-committee chair requirement at KC 6 of Principle 2, where the actual text addresses only the documented risk-management framework and the independence of control functions.

These are not edge questions. Principle 2's governance text and Annex F's oversight scope are live issues in every PFMI assessment cycle and in routine supervisory engagement with FMIs and their participants.

The structural risk for Company Secretaries is that AI-assisted research, used to accelerate first-pass orientation on a specific KC or annex provision, can introduce confident-but-wrong citations into working documents that are then circulated, cross-referenced in other policies, or submitted to counterparties and supervisors. The PFMI's KC-level structure means that an error at the KC-number or annex-scope level is hard to detect by skim review: it surfaces only when the reviewer goes back to the regulator's actual text, which is exactly the task the team had handed to the AI in the first place.

The findings at a glance

The table below lists each PFMI finding tested in this cell, showing the topic, the failure type, and the citation ID. Use it to identify which findings are most directly relevant to current work for Company Secretaries in international jurisdictions.

# | Finding title | Type | Citation ID
1 | Annex F critical service provider oversight, supervisory scope inverted | Hallucination | RLB-F-INT-BIS-CPMI-IOSCO-PFMI-2012-Q011
2 | PFMI Principle 2 KC 6, non-executive risk-committee chair mandate fabricated | Hallucination | RLB-F-INT-BIS-CPMI-IOSCO-PFMI-2012-Q022

Aggregate impact

The findings in this cell cluster around a single underlying failure shape: AI tools reconstructing the PFMI's governance and oversight architecture from training-data priors rather than from the regulator's actual KC-level text.

The drift takes several visible forms across the findings: a fabricated non-executive risk-committee chair mandate grafted onto KC 6, a soft risk-committee recommendation misattributed to KC 5 (whose actual subject is management roles), and an inverted reading of Annex F that converts a regulator-to-CSP oversight channel into an FMI-internalised contractual obligation. The common substrate is the same in each case: a model prior about how FMI governance and oversight "should" be structured overrides the PFMI's actual structural decisions.

For Company Secretaries, the systemic risk is not just that any one answer is wrong. It is that the AI output's surface structure (specific KC citations, quoted regulatory phrasing, named annex provisions) closely mimics the format of well-researched regulatory analysis. The error is invisible on review because the form looks authoritative; detection requires going back to the PFMI's published text and reading the cited KC itself.

The broader pattern visible across the PFMI findings says something specific about model behaviour on dense regulatory frameworks: where a KC's actual text is at the framework level ("the board should establish a documented risk-management framework"), the AI tools we tested filled in the implementation detail ("the risk committee should be chaired by a non-executive member") from corporate-governance priors rather than from the source document. For PFMI specifically, that drift is consequential because the framework is structured precisely to leave implementation choices to the FMI, while binding the FMI to framework-level obligations whose actual text matters.

What your team should do

The default position for Company Secretary teams working on PFMI matters should be to treat any AI output that supplies a specific PFMI citation (a Principle number, a Key Consideration reference, a quoted KC phrase, an Annex F provision) as a suggested search term, not a verified citation. The PFMI is available in full from the BIS publications portal, and the structure (Principles, Key Considerations, Annexes) is designed to be read directly. Any work product that will be presented to the board, a counterparty, or a regulator should build in a mandatory citation-verification step before the document leaves the team.

AI tools are most safely used here at the framing stage: identifying which Principles or annex provisions are likely to be relevant to a specific question, drafting the structure of a board paper or compliance mapping before the regulatory content is filled in from primary sources, or explaining the broader policy rationale behind the PFMI's framework choices. The risk sits entirely in the next step: asking the AI to supply the actual KC wording, the specific committee design requirement, the precise scope of Annex F oversight.

The findings in this cell show that AI outputs at that level are unreliable in ways that look authoritative.

For workflows where multiple company secretaries use AI assistance on PFMI questions, a simple challenge protocol helps: ask the AI to identify the exact source and page location for any specific KC or annex claim, and treat an evasive or hedged response as a signal to verify against the source document. A team standard that requires every PFMI citation in an outbound document to trace back to a verified line in the published PFMI text is the most reliable safeguard against the failure modes documented in this cell.
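One way a team might mechanise the first half of that standard is sketched below. It is a minimal illustration, not a RegLeg tool: the pattern `CITATION_RE`, the functions `extract_citations` and `unverified_citations`, and the `verified` set are all hypothetical names, and the pattern assumes citations are written in forms such as "Principle 2 KC 6", "Principle 2, Key Consideration 6", or "Annex F". The script only lists citations that nobody has yet checked; it does not check them, and the reading of the published PFMI text remains a human step.

```python
import re

# Hypothetical pattern (an assumption about house citation style): matches
# "Principle 2", "Principle 2 KC 6", "Principle 2, Key Consideration 6", "Annex F".
CITATION_RE = re.compile(
    r"Principle\s+(?P<principle>\d+)"
    r"(?:,?\s+(?:KC|Key Consideration)\s+(?P<kc>\d+))?"
    r"|Annex\s+(?P<annex>[A-Z])\b"
)


def extract_citations(text):
    """Return normalised PFMI citation keys found in a draft, in order, without repeats."""
    found = []
    for match in CITATION_RE.finditer(text):
        if match.group("annex"):
            key = f"Annex {match.group('annex')}"
        elif match.group("kc"):
            key = f"Principle {match.group('principle')} KC {match.group('kc')}"
        else:
            key = f"Principle {match.group('principle')}"
        if key not in found:
            found.append(key)
    return found


def unverified_citations(text, verified):
    """List citations in the draft that are absent from the team's verified register.

    `verified` is a hypothetical set of citation keys that a reviewer has
    already traced to a line in the published PFMI text.
    """
    return [key for key in extract_citations(text) if key not in verified]


if __name__ == "__main__":
    draft = (
        "Under Principle 2 KC 6 the committee chair must be non-executive. "
        "Annex F sets expectations for critical service providers."
    )
    # Only Annex F has been checked by a reviewer so far in this example.
    print(unverified_citations(draft, verified={"Annex F"}))
```

Run against an outbound draft, the list that comes back is the set of citations still to be read against the source document before release. The pattern will miss citations written in other styles, so an empty list is not evidence that a draft is clean.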

How RLB can help

RegLeg's published Hallucination Research gives Company Secretaries a structured pre-flight check before relying on AI tools for PFMI questions. The research surfaces precisely which PFMI Principles, Key Considerations, and annex provisions have generated confident-but-wrong AI output, and it does so with full citation ID trails back to the specific model output and the regulator's verbatim text.

Teams can use it as a targeted reference: when an AI tool produces a PFMI citation, the research tells you whether that area has a documented failure pattern and what the specific failure shape looks like, so you can apply directed human scrutiny rather than blanket scepticism.

Beyond the published research, RegLeg works with firms on bespoke deep-dives that map AI-supported workflows against documented failure modes for a specific regulator portfolio. For PFMI work specifically, that includes the Principle-by-Principle failure map, the Annex F oversight-scope failure pattern, and the cross-document fragility across the broader CPMI-IOSCO publication family (assessment methodologies, Level 3 assessments, CCP resilience guidance, stablecoin applicability guidance). The output is designed for sharing across a practice group or department and for use as a durable internal reference.

RegLeg also develops training and CPD-aligned content that translates the failure-mode catalogue into practical guidance for Company Secretaries. This material covers the classes of AI error that show up most often on dense regulatory frameworks like the PFMI (fabricated KC-number citations, generic-prior interference on governance text, scope inversions on regulator-facing provisions) and explains how to structure review and verification routines that catch the errors without slowing the team down.

RegLeg also offers a confidential review of a firm's existing AI-use policy against the failure-mode catalogue, identifying gaps between the policy's assumptions and the documented evidence of how AI tools behave on PFMI questions in practice.

Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.