AI Hallucination on Implementation Monitoring of the PFMI: Level 3 Assessment on General Business Risks for Lawyers in international jurisdictions

Executive Summary

The November 2025 CPMI-IOSCO Level 3 assessment on general business risk is a technically dense document: its value to practitioners lies almost entirely in the granular Key Consideration (KC) text of PFMI Principle 15 and the assessment's findings on how FMIs are implementing it. The AI assistants we tested produced wrong answers on three distinct questions about this regulation, and each one would have yielded a deliverable that is wrong on the law.

The failures cluster on Principle 15's Key Consideration architecture: AI tools repeatedly confused which KC carries which obligation, invented a dual-track minimum that does not exist in KC3, and in one instance flatly denied that a Basel carve-out appears in KC3 at all before retracting under challenge. For lawyers advising CCPs, CSDs, or trade repositories on Principle 15 compliance, whether in the context of regulatory examination, restructuring, or cross-border equivalence analysis, an AI-drafted memo that gets the KC structure wrong is not a rough draft; it is a liability.

How AI gets this regulation wrong

On this regulation, AI tools failed in two ways that are particularly hazardous for legal drafting: they invented rules, merging obligations from different Key Considerations into composite requirements that appear nowhere in the text, and, in one case, gave outdated information about the assessment's timeline as if it were settled fact. The dominant pattern is confident fabrication followed by retraction under challenge, which means the error only surfaces if someone in the workflow knows to push back.

AI's Failure Mode	Count	Affected findings
Exposed Fabrication	2	Finding#1 · Finding#2
Outdated	1	Finding#3

What that means for your practice

The failures identified here share one consequence: the AI produces a wrong deliverable, and in one case that error also creates direct liability or PI exposure. For lawyers, that means an advice memo, an internal briefing, or a client-facing compliance opinion that states the law incorrectly, with errors on the quantitative floor, the Basel carve-out condition, or the KC numbering that a sophisticated counterparty or regulator will catch. The table below groups the findings by risk impact.

Risk Impact	Count	Affected findings
Wrong deliverable	2	Finding#2 · Finding#3
Liability / PI exposure	1	Finding#1

When this affects Lawyers

The most common touchpoints are Principle 15 compliance opinions, where counsel is asked to confirm whether an FMI's methodology for liquid net assets funded by equity (LNAFE), its eligible equity instruments, and its segregation approach satisfy KC3. These opinions travel: they get attached to regulatory submissions, referenced in exam prep, and cited in restructuring due diligence. If the AI has silently merged KC2's scenario-analysis sizing test with KC3's six-month floor, or stated that KC3 contains a liquidity test it does not contain, the opinion is wrong from the first paragraph.

The Basel carve-out question arises specifically when CCPs holding CET1 under Basel III or CRD ask whether that equity counts towards LNAFE. The KC3 answer is a qualified yes: such equity can be included "where relevant and appropriate to avoid duplicate capital requirements." That qualifier is the entire legal analysis. An AI that either invents a different condition (a KC4 liquidity screen) or flatly denies the carve-out exists produces advice that will lead the client either to exclude eligible capital unnecessarily or to rely on capital that does not meet the actual condition.

A third scenario is regulatory engagement work: preparing methodology notes on the CPMI-IOSCO assessment itself, advising on the FIA/ISDA consultation response, or briefing a CCP's board on what the Level 3 findings mean for their Principle 15 programme. Here the timeline error matters: if the note states that the assessment ran through 2024 when the regulator's own document extends the engagement phase to April 2025, the client is working from a materially incomplete picture of the assessment's scope and findings-sharing process.

The findings at a glance

Three findings, each representing one or more AI tools tested against a specific Principle 15 question, are summarised below, with the failure type and citation ID for each.

#	Finding title	Type	Citation ID
1	KC3 Basel equity carve-out, invented liquidity condition / outright denial	Hallucination	RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q002
2	KC3 six-month LNAFE minimum, misattributed to KC2	Hallucination	RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q003
3	Assessment timeline, 2023–24 stated; correct period runs through April 2025	Hallucination	RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q005

Aggregate impact

Two of the three findings are variations on the same structural error: AI tools do not hold Principle 15's KC architecture accurately in their working model. KC2 and KC3 carry distinct but adjacent obligations: KC2 governs the sizing of the general business loss buffer through scenario analysis; KC3 sets the six-month floor and the conditions for counting Basel-regulated equity. AI tools repeatedly conflate the two, producing hybrid tests that read as authoritative but are invented.

The pattern is not random noise; it is a consistent mis-mapping of KC-level granularity that will produce the same wrong answer across different queries on the same regulation.
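
To make the KC3 floor concrete, the arithmetic can be sketched in a few lines. This is a minimal illustration, not a compliance methodology: it assumes the floor is read as six months of current operating expenses, that expenses are supplied as a single annual figure, and that LNAFE has already been determined. The function name, the class name, and the figures are hypothetical. It deliberately does not model the KC2 sizing exercise, which is a separate test.

```python
from dataclasses import dataclass

# PFMI Principle 15, KC3: LNAFE of at least six months of current operating expenses.
KC3_FLOOR_MONTHS = 6


@dataclass(frozen=True)
class FloorCheck:
    floor: int      # six-month floor, in the same currency units as the inputs
    shortfall: int  # zero when the floor is met


def kc3_floor_check(annual_operating_expenses: int, lnafe: int) -> FloorCheck:
    """Compare LNAFE with six months of current operating expenses.

    Assumption: expenses are given as one annual figure in whole currency
    units; integer division rounds the floor down to a whole unit.
    """
    floor = annual_operating_expenses * KC3_FLOOR_MONTHS // 12
    return FloorCheck(floor=floor, shortfall=max(0, floor - lnafe))


# Hypothetical figures, for illustration only.
check = kc3_floor_check(annual_operating_expenses=48_000_000, lnafe=20_000_000)
print(check.floor, check.shortfall)
```

With these hypothetical inputs the floor is 24,000,000 and the shortfall is 4,000,000. The point of the sketch is that the floor is a single quantity with a single attribution (KC3); any AI answer that describes two alternative minimums, or places the floor in KC2, is describing something else.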

The third finding, the timeline error on the assessment's duration, is structurally different but equally problematic in practice. It reflects AI tools anchoring on secondary-source shorthand (the data-collection window described as "2023–24") rather than the primary document's full lifecycle description. For lawyers drafting regulatory engagement summaries or advising on the assessment's findings, attributing the wrong end-date to the process is not a cosmetic error: it misrepresents the scope of FMI engagement and the findings-sharing phase, which affects how the assessment's conclusions are characterised.

Taken together, these findings mean that AI tools are unreliable across the two main types of Principle 15 work a lawyer is likely to do: (1) substantive compliance analysis that requires accurate KC attribution, and (2) regulatory-process work that requires accurate characterisation of the assessment itself. The failures are not edge cases; they arise on the most foundational questions: what is the floor, how is it structured, what qualifies, and when did the regulator conduct its review.

What your team should do

The default position for Principle 15 legal work is straightforward: treat AI-generated KC-level analysis as a starting hypothesis, not a citable answer. Every KC attribution (which KC carries the six-month floor, which KC governs eligible equity, which KC imposes segregation) must be verified against the PFMI text directly. This is not a burdensome check; the relevant Principle 15 KC text is short. The problem is that AI tools produce responses that sound precise and appropriately hedged, which suppresses the instinct to verify.

Instil in juniors the habit of pulling the KC verbatim before drafting any substantive Principle 15 analysis, regardless of how specific the AI's answer appears.

For the Basel carve-out specifically, do not rely on AI to reproduce the qualifier accurately: equity held under international risk-based capital standards can count toward LNAFE "where relevant and appropriate to avoid duplicate capital requirements". The findings show that AI tools either replace that condition with a different test or deny it exists. When advising a CCP on whether its CET1 qualifies, the analysis lives in those nine words; get them from the primary source.
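
For teams that want a mechanical backstop to the verbatim-check habit, a simple string comparison can flag drafts that do not contain the primary-source wording. The sketch below is an assumption-laden illustration rather than a review tool: the function names and the two sample drafts are hypothetical, the qualifier is copied from this briefing and should itself be confirmed against the PFMI text, and a match only shows that the words are present, not that the surrounding analysis is right.

```python
import re

# Qualifier as quoted in this briefing; confirm it against the PFMI text before use.
KC3_QUALIFIER = "where relevant and appropriate to avoid duplicate capital requirements"


def normalise(text: str) -> str:
    """Lower-case, straighten curly quotes, and collapse runs of whitespace."""
    text = text.replace("\u2019", "'").replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def quotes_verbatim(draft: str, primary_text: str) -> bool:
    """Return True if the draft contains the primary wording verbatim after normalisation."""
    return normalise(primary_text) in normalise(draft)


# Hypothetical draft sentences, for illustration only.
invented_test = (
    "Equity held under Basel III counts toward LNAFE provided it passes "
    "the KC4 liquidity screen."
)
faithful_quote = (
    "Such equity can be included where relevant and  appropriate to avoid "
    "duplicate capital requirements."
)

print(quotes_verbatim(invented_test, KC3_QUALIFIER))
print(quotes_verbatim(faithful_quote, KC3_QUALIFIER))
```

The first draft fails the check because it substitutes an invented condition for the qualifier; the second passes despite a stray double space. A failed check is a prompt to open the primary document, not a finding in itself.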

AI tools are useful in this space for procedural and structural orientation: identifying that Principle 15 has five Key Considerations, locating where general business risk sits in the broader PFMI architecture, or summarising the categories of FMI covered by the assessment. AI is also reasonably reliable for background on the FIA/ISDA consultation response at a thematic level, provided the team does not rely on AI for the specific proposals those organisations made. For any KC-level text, any quantitative threshold, and any characterisation of the assessment's timeline or methodology, go to the primary document.

How RLB Can Help

RegLeg's published Hallucination Research is available as a free pre-flight check for lawyers working on regulatory matters. Before relying on AI-assisted output, whether for advice, drafting, or due diligence, lawyers can consult the research to understand which failure modes have been observed for the specific regulation in question. This is not a substitute for legal judgement, but it is a structured, independent reference that flags where AI tools have historically misfired, allowing practitioners to focus their human verification effort on the highest-risk points.

For firms where multiple lawyers work across the same regulatory portfolio, RegLeg offers bespoke deep-dive engagements. These go beyond the published research to examine the specific regulations, jurisdictions, and question types most relevant to the firm's practice. The output is a tailored briefing that legal teams can use as a standing reference, updated as the regulatory landscape evolves, giving the whole team a shared, consistent picture of where AI tools should be treated with caution and where they have performed reliably.

RegLeg also works with legal teams on training and CPD-aligned content. This covers the categories of failure lawyers are most likely to encounter, including outdated regulatory text, cross-jurisdictional confusion, and misattributed citations, framed around real regulatory examples rather than abstract AI theory. Separately, RegLeg can conduct a confidential review of a firm's existing AI-use policy, assessing it against the failure-mode catalogue the research has surfaced. The output is a structured gap analysis: which risks the policy already addresses, which it does not, and where practical amendments would strengthen the firm's position.