Payment Institutions × Treasury — International / Multilateral · Last updated 11 Jun 2026 · methodology v2.3 · Hallucination Register

AI Hallucination on Implementation Monitoring of the PFMI: Level 3 Assessment on General Business Risks for Treasury teams at Payment Institutions in international jurisdictions

Payment Institutions Treasury teams: documentation and reporting gaps possible from AI reading of PFMI Level 3 General Business Risk (2025)

Treasury teams at payment institutions are increasingly using AI to draft buffer composition memos on liquid net assets funded by equity (LNAFE) for the group treasurer, validate Basel-versus-LNAFE capital eligibility across legal entities, prepare quarterly liquidity-buffer trend commentaries, and scope cross-cycle treasury planning against the November 2025 CPMI-IOSCO Level 3 cycle. The November 2025 CPMI-IOSCO Level 3 assessment of general business risk, recorded under PFMI Principle 15, is the supervisory exercise most directly bearing on this practice area in the current cycle.

As AI tooling enters the drafting layer, the question is no longer whether AI-assisted work product reaches client-facing deliverables; it is whether the work product reaches them with the regulator-text fidelity that PI Treasury teams need.

The RLB Specialist Panel tested two frontier AI models on a question set covering the LNAFE quantitative floor, the Basel/CRD equity carve-out condition, and the November 2025 assessment lifecycle. The Panel records 2 findings on this audience-specific cell. The failure patterns in scope are quantitative-floor inflation into a fabricated composite minimum and outright denial of a carve-out the rule records explicitly. Questions are prepared by the RLB Specialist Panel based on real practical AI usage in the workflows the respective audience uses AI for. The Panel binds each AI finding to verbatim regulator-issued source text held as primary substrate.

For PI Treasury teams the operational consequence is direct. A treasurer's memo that frames KC3 as a "greater of" dual-track minimum overstates the regulatory floor, and a Basel eligibility memo that imports a liquidity test that does not appear in KC3 understates the eligible equity pool; either framing miscalibrates the treasury plan.

PFMI Principle 15 is one of the cleanest primary-source surfaces in the cross-border CCP and CSD universe: a Key Consideration cited in a deliverable is either the right KC or it is not; a quantitative floor is either the regulator's text or it is not; an assessment-period date range is either accurate or it is not. Each is recoverable on a routine line-by-line read.

The audit's 2 findings for this cell carry immutable RLB Citation IDs and are bound to verbatim regulator-issued source text held by the RLB Specialist Panel: RLB-H-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q003-Opus47, RLB-H-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q002-Sonnet46. The full audit on the November 2025 CPMI-IOSCO Level 3 assessment is published at the PFMI Level 3 General Business Risk hub on RegLegBrief.com.

Executive Summary

The PFMI Level 3 assessment examines whether FMIs actually implement Principle 15's general business risk standard, in particular the liquid net assets funded by equity (LNAFE) provisions that translate the standard into a buffer a firm must hold, instruments it can count, and conditions it must satisfy. For Treasury teams at Payment Institutions operating across international markets, whether as direct FMI participants, indirect members, or firms whose products route through systemically critical infrastructure, that KC-level detail feeds directly into capital planning, FMI counterparty assessment, and regulatory mapping against equivalent domestic frameworks.

Across three tested questions on this regulation, AI assistants consistently mischaracterised the most operationally consequential provisions: where the six-month LNAFE floor sits in the KC structure, how the LNAFE minimum is calculated, and what condition actually governs whether Basel/CRD equity can be counted. Every failure was an exposed fabrication: AI tools gave confident, specific answers and either inverted their position or admitted uncertainty only when pressed directly against the source text, indicating no reliable access to the underlying standard.

How AI gets this regulation wrong

Every failure recorded on this regulation follows the same pattern: AI tools produced confident, specific answers, complete with Key Consideration references, quantitative thresholds, and qualifying conditions, and retracted or reversed only when pressed directly against the source text. The errors are not peripheral misreadings; they include inverting which Key Consideration contains the operative quantitative minimum, inventing a calculation structure with no basis in the standard, and either fabricating or flatly denying the Basel equity carve-out condition that KC3 expressly states.

AI's Failure Mode | Count | Affected findings
Exposed Fabrication | 2 | Finding #1 · Finding #2

What that means for your team

The risk is concentrated in regulatory enforcement: firms that build LNAFE policy, calibrate capital buffers, or respond to supervisory queries using AI-generated summaries of Principle 15 face direct exposure if the underlying KC mapping is wrong. For a Payment Institution whose liquidity or capital framework references PFMI standards, either because it participates in or connects to an FMI, or because a domestic regulator has applied equivalent requirements, a miscalibrated floor or a misstated Basel carve-out condition is not a drafting inaccuracy; it is a compliance failure that CPMI-IOSCO's Level 3 assessment process is specifically designed to surface.

Risk Impact | Count | Affected findings
Regulatory enforcement | 2 | Finding #1 · Finding #2

When this affects your department

Treasury teams at Payment Institutions engage with PFMI Principle 15 in a range of operational contexts: mapping the firm's liquidity and capital requirements against FMI membership rules, advising business development on the buffer implications of new corridors or settlement arrangements, preparing internal briefings when a domestic regulator applies PFMI-equivalent standards, and conducting due diligence on FMI counterparties as part of settlement and systemic risk assessments.

In cross-border environments where multiple jurisdictions apply the Level 3 framework with different implementation gaps, the KC-level detail matters, and AI tools are a natural shortcut when a business line or a senior stakeholder needs a quick answer on what Principle 15 actually requires.

The specific danger here is that Principle 15's KC2 and KC3 are closely related but structurally distinct: KC2 governs the scenario-analysis sizing obligation; KC3 sets the six-month quantitative floor and the Basel equity carve-out. AI tools tested on this regulation consistently blurred that boundary, attributing KC3's minimum to KC2, merging KC2's scenario analysis into KC3's floor as a fabricated "greater of" construct, and either misrepresenting or denying outright the Basel carve-out condition that KC3 expressly states.

A Treasury analyst who receives that answer, trusts it, and embeds it in a policy paper, capital adequacy memo, or regulatory response has built a compliance position on incorrect text.

The enforcement exposure is direct. CPMI-IOSCO's Level 3 assessment is specifically designed to identify gaps between the published standard and actual FMI implementation, and regulators applying that lens will cross-reference firms' stated LNAFE policies against the KC text. A Payment Institution that has drafted its framework using AI-generated KC attributions, calculated its buffer against a non-existent "greater of" floor, or applied the wrong qualifying condition for Basel equity inclusion will not be able to defend the position under supervisory scrutiny, and the discovery that the error originated from an AI-assisted briefing does not mitigate the compliance gap.

The findings at a glance

All three findings concern PFMI Principle 15's LNAFE provisions, tested across the KC-level structure, the minimum calculation method, and the Basel equity carve-out qualifier. These are the provisions Treasury teams are most likely to rely on when calibrating buffers or drafting policy.

# | Finding title | Type | Citation ID
1 | KC3 Basel equity carve-out condition fabricated | Hallucination | RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q002
2 | LNAFE minimum recast as non-existent greater-of floor | Hallucination | RLB-F-INT-BIS-CPMI-IOSCO-PFMI-L3-GENERAL-BUSINESS-RISK-2025-Q003

Aggregate impact

The three findings cluster tightly on a single sub-section of the standard: PFMI Principle 15, Key Considerations 2 and 3, and specifically the LNAFE framework. This is not a random scatter of errors; it reflects a systematic failure by AI tools to maintain the structural boundary between KC2 (scenario-based sizing) and KC3 (the quantitative floor and equity qualification rules). In every tested question where that distinction mattered, AI tools conflated or reassigned the provisions, producing answers that are internally plausible but wrong at the KC level.

The practical consequence for a Treasury function is that the most operational layer of Principle 15, the part that translates into a number to hold, an instrument to count, and a condition to satisfy, is exactly where AI tools are unreliable. Finding 2 shows an AI inventing a "greater of" dual-track structure for KC3 that imports KC2's scenario analysis leg as a co-equal minimum. Finding 3 shows a different AI placing the six-month floor in KC2 entirely, treating KC3 as solely a segregation clause.

Finding 1 shows AI tools either replacing KC3's actual Basel carve-out qualifier with a fabricated KC4 liquidity test, or denying the carve-out exists in KC3 at all. These are not edge interpretations; they are errors on the verbatim text of the standard.
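
The size of the Finding 2 error can be sketched in a few lines of Python. This is a minimal illustration, not a calculation method taken from the PFMI: the function names and all figures are hypothetical, and deriving six months of expenses as half of an annual figure is an assumption made here for simplicity. The only element drawn from the standard is KC3's floor of six months of current operating expenses.

```python
from decimal import Decimal

# KC3: at least six months of current operating expenses
MONTHS_IN_FLOOR = 6


def kc3_floor(annual_operating_expenses: Decimal) -> Decimal:
    """Six months of operating expenses, derived from an annual figure (assumption)."""
    return annual_operating_expenses * MONTHS_IN_FLOOR / 12


def fabricated_greater_of(annual_operating_expenses: Decimal,
                          scenario_amount: Decimal) -> Decimal:
    """The construct described in Finding 2. NOT in the standard; shown for contrast only."""
    return max(kc3_floor(annual_operating_expenses), scenario_amount)


# Hypothetical figures, in millions of any currency
annual_opex = Decimal("24")
scenario_amount = Decimal("15")  # hypothetical output of a KC2 scenario analysis

floor = kc3_floor(annual_opex)                                  # 12
inflated = fabricated_greater_of(annual_opex, scenario_amount)  # 15
overstatement = inflated - floor                                # 3

print(f"KC3 floor: {floor}, greater-of reading: {inflated}, overstated by: {overstatement}")
```

In this hypothetical, a memo built on the greater-of reading would report a KC3 minimum of 15 rather than 12. The scenario-based amount still belongs in the KC2 sizing analysis; it is not part of the KC3 floor.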

For a Payment Institution operating across international markets, the systemic risk is that this cluster of errors survives internal review. A Treasury analyst querying AI on Principle 15 will receive a plausible, internally consistent answer that mislocates the floor, misstates the carve-out, or conflates the KCs, and a reviewer checking the output would need to go to the PFMI source text itself to catch it. Firms using AI-assisted regulatory mapping for Principle 15 should treat the entire KC2/KC3 intersection as a mandatory verification step, not an AI-safe zone.

What your team should do

The default position for Treasury teams using AI on PFMI Principle 15 is straightforward: AI output on KC-level attribution is unverified until cross-referenced against the published PFMI source text. The errors documented here are not in interpretive grey areas; they involve specific KC assignments, a specific quantitative minimum (six months of current operating expenses in KC3), and a specific qualifying condition for Basel equity inclusion. All are verifiable in under two minutes against the CPMI-IOSCO Principles for Financial Market Infrastructures (April 2012) and the November 2025 Level 3 assessment report.

Any Treasury policy, capital memo, or regulatory submission that cites KC2 or KC3 specifics should carry a verification flag until the KC text has been read directly.

The sharpest risk is in briefing-type queries, where a senior stakeholder or business line asks for a quick summary of what Principle 15 requires and Treasury responds with AI-generated output. The KC2/KC3 boundary confusion survives in a summary, gets quoted in a business case, and appears in a regulatory response before anyone checks the original. The structural safeguard is citation discipline: any internal output on Principle 15 KC-level detail should explicitly cite the paragraph of the PFMI, not a paraphrase, and should note whether it has been source-verified.
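
One way a team might operationalise the verification flag and citation discipline is a small record attached to each KC-level claim in a draft. The sketch below is an assumption about how such a register could look, not an RLB or CPMI-IOSCO artefact: the `KCClaim` type, its field names, and the placeholder paragraph reference are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class KCClaim:
    """A KC-level statement in a draft, with its citation status (hypothetical schema)."""
    statement: str
    key_consideration: str           # e.g. "KC2" or "KC3"
    pfmi_paragraph: Optional[str]    # paragraph reference in the PFMI, if one is cited
    source_verified: bool = False    # True only after the KC text has been read directly


def release_blockers(claims: List[KCClaim]) -> List[KCClaim]:
    """Return claims that lack a paragraph citation or have not been source-verified."""
    return [c for c in claims if not c.pfmi_paragraph or not c.source_verified]


draft_claims = [
    KCClaim(
        statement="Minimum LNAFE is six months of current operating expenses",
        key_consideration="KC3",
        pfmi_paragraph="PLACEHOLDER-PARAGRAPH-REF",  # placeholder, not a real reference
        source_verified=True,
    ),
    KCClaim(
        statement="Basel/CRD equity carve-out condition as summarised by an AI tool",
        key_consideration="KC3",
        pfmi_paragraph=None,
    ),
]

for claim in release_blockers(draft_claims):
    print(f"UNVERIFIED [{claim.key_consideration}]: {claim.statement}")
```

A draft would leave Treasury only when `release_blockers` returns an empty list, which also gives the firm a documented basis for each KC-level statement.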

That discipline also protects the firm when a regulator requests the basis for a stated compliance position.

AI tools remain usable for orientation: identifying that Principle 15 addresses general business risk, that LNAFE is the operative concept, and that the CPMI-IOSCO Level 3 process exists and published assessment findings in November 2025. At that level of abstraction the risk of a material error is lower. The danger zone is precision: as soon as a query asks for a specific number, a specific qualifying condition, or a specific KC assignment, treat the AI response as a hypothesis that requires source verification before it enters any work product.
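
The orientation-versus-precision split could be approximated with a crude triage on the query itself. The sketch below is illustrative only: the keyword patterns are assumptions chosen for this example, they will miss many phrasings, and they are no substitute for reading the KC text.

```python
import re
from typing import List

# Hypothetical keyword patterns for the three precision triggers named above
PRECISION_PATTERNS = {
    "specific number": re.compile(
        r"\b(?:how much|how many|minimum|floor|threshold|months?)\b", re.IGNORECASE),
    "KC assignment": re.compile(
        r"\bKC\d*\b|\bkey consideration", re.IGNORECASE),
    "qualifying condition": re.compile(
        r"\b(?:condition|qualif\w*|eligib\w*|carve-out)\b", re.IGNORECASE),
}


def precision_triggers(query: str) -> List[str]:
    """Return the precision triggers a query hits; an empty list suggests orientation-level use."""
    return [label for label, pattern in PRECISION_PATTERNS.items() if pattern.search(query)]


queries = [
    "What does Principle 15 address?",
    "Which KC sets the six-month floor?",
    "Can Basel equity count toward LNAFE, and under what condition?",
]
for q in queries:
    hits = precision_triggers(q)
    status = "verify against source" if hits else "orientation"
    print(f"{status}: {q} {hits}")
```

Any query that hits at least one trigger would be routed to source verification before its answer enters a work product.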

How RLB Can Help

RegLeg's published Hallucination Research is a practical pre-flight check for any Treasury team running AI tools against regulatory questions. Before you rely on an AI-generated interpretation of safeguarding thresholds, capital buffer calculations, or cross-border settlement finality rules, the research tells you where those tools have already been caught fabricating or inverting the position.

For a Payment Institutions Treasury function, where a misread PSD2 derogation or a garbled EMD2 safeguarding ratio goes directly to a compliance breach, that's not an abstract risk catalogue; it's a live read on which questions you should be routing to the tool and which ones you should not.

Beyond the published findings, RLB works with Treasury teams on regulator-specific deep-dives that map AI-supported workflows to their actual hallucination exposure. Payment Institutions running multi-jurisdiction operations carry a specific pattern: the AI tools tend to perform well on home-jurisdiction rules they've seen repeatedly in training and degrade sharply on host-country PSD2 transpositions, local safeguarding equivalents, and FX settlement window rules that vary by corridor.

We can scope that exposure systematically, by workflow, by jurisdiction, by the regulatory surface area your team actually touches, so you know precisely where to insert human review rather than blanketing everything or missing the real gaps.

If your firm already has an AI-use policy in place, RLB can run a confidential review against our failure-mode catalogue, flagging where the policy's permitted-use boundaries don't account for the specific failure patterns we've documented in payment regulation contexts, and producing a prioritised remediation list your team can work through. We can also build that work into CPD-aligned training material, structured around the actual Treasury workflows (liquidity reporting, safeguarding reconciliations, regulatory capital monitoring) rather than generic AI literacy content, so your team has a defensible, documented basis for how it uses and quality-controls AI output in regulated processes.

Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.