AI Hallucination on Implementation Monitoring of the PFMI: Level 3 Assessment on General Business Risks for Compliance teams at Investment Banking firms in international jurisdictions

Executive Summary

When Compliance teams at international investment banking firms rely on AI tools to interpret CPMI-IOSCO's November 2025 Level 3 assessment of general business risk (GBR) under the PFMI, the results are materially unreliable. Across three findings, AI assistants failed on the core technical questions a Compliance function is most likely to use AI for: the precise capital composition of the Principle 15 liquid net assets funded by equity (LNAFE) requirement, the structural relationship between the six-month operating expense floor and the wind-down plan funding obligation, and the authoritative timeline of the assessment itself.

Two of the three failures followed a pattern of confident initial assertion and then retraction under challenge, meaning the error surfaces only if someone pushes back, not if a junior analyst uses the first response as-is. The third failure delivered a stale characterisation of the assessment period, directly undermining any internal benchmarking or regulatory submission that cites timing and process provenance.

How AI gets this regulation wrong

The failures on this regulation cluster into two patterns: AI tools that confidently asserted incorrect rules and quietly self-corrected only when challenged, and an instance where AI presented an outdated characterisation of the assessment timeline as current fact. What makes the first pattern particularly dangerous is its surface plausibility: the answers are structurally coherent and cite recognisable PFMI concepts, but the technical detail is wrong in exactly the ways that matter for a capital-adequacy or regulatory-submission context.

AI's Failure Mode	Count	Affected findings
Exposed Fabrication	2	Finding#1 · Finding#2
Outdated	1	Finding#3

What that means for your team

The risk profile here splits cleanly between regulatory enforcement exposure, where AI errors on the Principle 15 capital structure land in submissions, policy documents, or counterparty assessments, and wrong-deliverable risk, where internal benchmarking materials or board-level briefings are built on an inaccurate provenance of the assessment itself. Both categories are live problems for a Compliance function whose work product is reviewed by regulators and internal audit alike.

Risk Impact	Count	Affected findings
Regulatory enforcement	2	Finding#1 · Finding#2
Wrong deliverable	1	Finding#3

When this affects your department

Investment banking Compliance teams encounter this regulation primarily through three entry points: assessing CCPs and trade repositories as counterparties or infrastructure providers under third-party risk frameworks, supporting business lines in interpreting how the PFMI's GBR standards affect CCP margin models and clearing mandates, and preparing or reviewing internal capital adequacy mapping for clearing-related exposures. AI tools get pulled into each of these because the November 2025 Level 3 assessment is recent, technically dense, and the source document is not the kind of text a junior analyst routinely has open.

The temptation to run a summary query and pass the answer forward is high.

The Principle 15 LNAFE question is where the regulatory enforcement risk concentrates. If Compliance is advising on whether a CCP counterparty meets its GBR obligations (for credit exposure purposes, for onboarding sign-off, or for internal audit defence on clearing-related controls), an AI answer that inverts the Basel/CRD carve-out or conflates the KC3 six-month floor with the KC4 wind-down funding requirement will produce a materially wrong assessment.

The correct read is that equity held under international risk-based capital standards can count toward LNAFE where relevant and appropriate; AI tools tested here denied that carve-out or collapsed the two independent requirements into one, leaving Compliance with a deficiency assessment that over-states the obligation and may prompt unnecessary escalation or misallocation of capital scrutiny.

The assessment-period error (Finding 3) lands hardest when the team is building benchmark materials, board MI, or regulatory correspondence that contextualises the November 2025 findings against the firm's own CCP oversight cycle. CPMI-IOSCO explicitly characterises the assessment as running 2023–25, inclusive of the FMI validation phase conducted in April 2025. AI tools tested on this question returned "2023 and 2024" as the definitive range, a shorthand drawn from secondary commentary, not the authoritative text. A regulatory submission or internal report that cites the wrong assessment window creates a verifiability gap that any regulator or internal audit team will flag immediately.

The findings at a glance

The three findings below cover the full scope of what AI tools got wrong on this regulation when tested against questions a Compliance team at an international investment bank would realistically ask.

#	Finding title	Type	Citation ID
1	Basel equity carve-out denied in LNAFE calculation	Hallucination	RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q002
2	KC3 and KC4 requirements merged into single misattributed formula	Hallucination	RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q003
3	Assessment period mischaracterised as 2023–2024 not 2023–25	Hallucination	RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q005

Aggregate impact

All three findings cluster on the same technical territory: the mechanics of Principle 15's liquid net assets funded by equity requirement and the provenance of the November 2025 assessment. That concentration is not coincidental: these are exactly the questions a Compliance team drafts into submissions, counterparty assessments, and board MI, because they are the numerical and procedural anchors the regulation provides. AI tools failed at precisely the point where precision matters most.

Findings 1 and 2 represent variants of the same structural error. One AI tool asserted that Basel/CRD-mandated equity cannot count toward LNAFE and must be held entirely on top, the opposite of what KC3's final sentence explicitly permits. A second collapsed the KC3 six-month floor and the KC4 wind-down plan funding requirement into a single composite formula attributed to KC3 alone. Both errors are plausible on first read because they preserve the surface vocabulary of the regulation.

The tell is that in both cases the AI self-corrected under challenge, meaning a Compliance analyst who accepts the first response and moves on will carry the wrong rule forward, while only one who actively interrogates the answer gets the correction. That dynamic is exactly what quality control on AI-assisted work needs to be designed against.

Finding 3 introduces a separate systemic risk: the AI's characterisation of the assessment's timeline was drawn from a secondary commentary source rather than the CPMI-IOSCO publication itself, and the secondary source used a shorthand that dropped the 2025 validation phase from the date range.

For a Compliance team building benchmark materials or preparing a regulatory response, the difference between "2023–2024" and "2023–25" is not cosmetic. It determines whether the FMI validation step is treated as inside or outside the assessment scope, which affects how gaps are characterised and how the firm's own CCP oversight conclusions should be sequenced against the publication date.

What your team should do

The default position for Compliance work on this regulation should be: AI tools are useful for orientation and drafting structure, but not for technical rule statements on Principle 15 capital mechanics or for citing procedural details about the assessment itself. Both of those categories require primary-source verification before anything leaves the team, and these findings show the errors will not always announce themselves. The LNAFE failures in particular produced structurally coherent, confident answers that only revealed themselves as wrong under direct challenge. A quality-control workflow that only checks outputs against secondary commentary will not catch them.

For counterparty assessments and CCP oversight work, the practical safeguard is to maintain a standing reference against the KC3 and KC4 text before any AI-assisted output is passed to a business line, included in board MI, or used in regulatory correspondence. The specific risk to verify against: that equity held under international risk-based capital standards can be included in the LNAFE calculation where relevant and appropriate (KC3 final sentence), and that the KC4 wind-down plan funding requirement is a separate, additional obligation, not merged into the KC3 floor. These are the two precise points where AI tools failed.
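As a rough illustration of those two verification points, the Python sketch below keeps the six-month floor and the wind-down funding test as separate conditions, and counts equity held under risk-based capital standards only when a reviewer has marked it eligible. The class, field names, and figures are hypothetical, and simply adding eligible equity to dedicated LNAFE is a simplifying assumption made for illustration, not a reading of the KC3 or KC4 text.

```python
from dataclasses import dataclass


@dataclass
class GbrInputs:
    """Hypothetical inputs; every field name and figure is illustrative."""
    annual_operating_expenses: float  # current operating expenses per year
    wind_down_plan_cost: float        # the FMI's own estimate of funding its plan
    dedicated_lnafe: float            # LNAFE held specifically for general business risk
    risk_based_capital_equity: float  # equity held under international risk-based standards
    risk_based_equity_eligible: bool  # "relevant and appropriate" call, made by a human reviewer


def eligible_lnafe(x: GbrInputs) -> float:
    # Assumption: eligible risk-based equity is added in full to dedicated LNAFE.
    extra = x.risk_based_capital_equity if x.risk_based_equity_eligible else 0.0
    return x.dedicated_lnafe + extra


def check_gbr(x: GbrInputs) -> dict:
    lnafe = eligible_lnafe(x)
    six_month_floor = x.annual_operating_expenses / 2
    # The two tests are reported separately rather than merged into one formula.
    return {
        "lnafe": lnafe,
        "six_month_floor": six_month_floor,
        "meets_six_month_floor": lnafe >= six_month_floor,
        "meets_wind_down_funding": lnafe >= x.wind_down_plan_cost,
    }


if __name__ == "__main__":
    allowed = GbrInputs(120.0, 70.0, 50.0, 30.0, True)
    denied = GbrInputs(120.0, 70.0, 50.0, 30.0, False)
    print(check_gbr(allowed))
    print(check_gbr(denied))
```

With these made-up figures, the same entity passes both tests when the risk-based equity is counted and fails both when it is excluded, which is the kind of overstated deficiency the first finding describes.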

For benchmarking and regulatory submission work that references the November 2025 assessment, verify the assessment period against the CPMI-IOSCO publication directly. The authoritative characterisation is 2023–25, inclusive of the April 2025 FMI validation phase. AI is reliable for summarising the structural findings of the assessment and for drafting the framing sections of internal reports. It is the provenance details (dates, participation basis, validation methodology) that require a direct read of the source, not a secondary summary.
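One way to support that check is a simple scan of draft text for the shorthand date ranges described in this report. The sketch below is a hypothetical helper, not an established tool: the function name and the pattern are assumptions, and a match only marks a phrase for human review against the source, since "2023 and 2024" can appear legitimately in other contexts.

```python
import re

# Assumption: these are the shorthand forms worth flagging, based on the
# "2023 and 2024" and "2023–2024" variants described in this report.
STALE_PERIOD = re.compile(r"2023\s*(?:[–-]|and|to)\s*2024")


def flag_stale_period(text: str) -> list[str]:
    """Return every phrase in the draft that matches a shorthand date range."""
    return [m.group(0) for m in STALE_PERIOD.finditer(text)]


if __name__ == "__main__":
    draft = "The Level 3 assessment was conducted over 2023 and 2024."
    for hit in flag_stale_period(draft):
        print("Review against the CPMI-IOSCO publication:", hit)
```

The scan deliberately says nothing about what the correct period is; it only routes suspect phrases back to a reader with the primary source open.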

How RLB can help

RegLeg's published Hallucination Research gives Compliance teams at investment banks a practical pre-flight check before acting on AI-generated regulatory output. Because the research spans regulators across multiple jurisdictions and documents the specific failure modes that occur when AI tools engage with financial services rules, Compliance staff can consult the findings as an independent reference, confirming where AI-assisted research is reliable, and flagging the regulatory domains where confident-sounding output has most frequently proved incorrect.

For firms that want to go further, RegLeg offers bespoke regulator deep-dives scoped to the workflows your Compliance function actually relies on. This means mapping which AI-supported activities (regulatory horizon scanning, policy gap analysis, transaction monitoring guidance, or senior manager accountability queries) carry the highest hallucination exposure in your specific operating environment, and prioritising attention accordingly. Where an investment bank is subject to a regulator whose track record in the published research gives cause for caution, that context is built into the engagement from the outset.

RegLeg also works with Compliance teams on a confidential review of existing AI-use policies, assessing them against a structured failure-mode catalogue drawn from the research. The output is a prioritised remediation plan that identifies gaps in current oversight controls and suggests practical adjustments, including escalation triggers, secondary-verification requirements, and human sign-off thresholds suited to a regulated institution. Firms that have completed the review have used the findings directly as the basis for CPD-aligned internal training, giving Compliance staff the working knowledge they need to apply appropriate scepticism to AI tools without abandoning the efficiency gains they provide.