Statutory Boards & Agencies × Compliance — International / Multilateral · updated 2026-06-11 · methodology v2.3

AI Hallucination on Harmonised ISO 20022 Data Requirements for Enhancing Cross-Border Payments - Updated Report for Compliance teams at Statutory Boards & Agencies firms in international jurisdictions

Statutory Boards & Agencies Compliance teams: documentation and reporting gaps possible from AI reading of CPMI ISO 20022 Harmonisation (2026 update)

Compliance teams at Statutory Boards & Agencies responsible for payments infrastructure exposure to the CPMI Harmonised ISO 20022 Data Requirements (Updated Report) are increasingly using AI to draft formal regulatory submissions to central banks or supranational bodies, generate board papers on payments infrastructure benchmarks, and prepare gap-analysis documents against peer adoption rates. The same tools validate citation accuracy in finance-ministry-facing briefings.

Two frontier AI models tested by the RLB Specialist Panel on the workflows statutory-agency compliance officers use to support advice on the CPMI Harmonised ISO 20022 Data Requirements (Updated Report) produced two discrete hallucinations, each bound to regulator-issued source text. The Panel records a single recurring failure class across the set: Numeric Drift. The Specialist Panel prepared the questions from real practical AI usage in those workflows, and each finding is bound to verbatim regulator-issued source text held as primary substrate.

For Compliance teams at Statutory Boards & Agencies, each hallucination has a direct operational consequence in the regulatory submission, gap-analysis document, or finance-ministry briefing. The Panel's testing surfaces the same error in both models: ISO 20022 adoption rate conflation (RTGS vs faster payments). Where this error flows into a deliverable, the exposure is a credibility-damaging factual discrepancy in a formal submission, a forced retraction or amended filing, and a misstated baseline for gap analysis presented to the governing board.

The pattern is uniform across the set: the AI returns a confident, sourced-looking answer that conflicts in a load-bearing specific with the regulator's verbatim text, and the error survives a first-pass review precisely because the surface form is plausible. The Panel records each hallucination with the regulator's primary substrate held as the anchor, so the corrective text is available alongside the failure.

The Specialist Panel records the citation IDs as follows: RLB-H-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Opus47 (Claude Opus 4.7 (web search on), Numeric Drift); RLB-H-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Sonnet46 (Claude Sonnet 4.6 (web search on), Numeric Drift). Each citation links to the verbatim regulator-issued source text, the tested AI question, and the recorded AI response, so the Panel's assessment is traceable end to end. For compliance teams at statutory boards & agencies, the citation IDs operate as a reference index: when an AI answer in the working draft matches a known Panel finding, the cited regulator text is already available as the corrective anchor.

The full per-finding analysis cards, including the audience-specific impact statement, sit on the cell's detail surface for sign-off use.

Executive Summary

Compliance teams at Statutory Boards & Agencies firms in international jurisdictions rely on accurate ISO 20022 adoption data to benchmark their own implementation progress, advise on interoperability obligations, and report accurately to governing boards and finance ministers on where the global payments ecosystem stands. On the CPMI Harmonised ISO 20022 Data Requirements Updated Report, the AI tools we tested gave wrong figures for the most-referenced monitoring datapoint in the regulation: the share of faster payment systems and RTGS systems that have adopted ISO 20022.

The failure was not a marginal rounding error: AI tools collapsed two materially different figures (faster payment systems at over three-quarters; RTGS systems at approaching half) into a single inflated number applied uniformly to both system types, overstating RTGS adoption by roughly 30 percentage points. When pressed on the discrepancy, the AI tools acknowledged uncertainty, confirming the confident initial answer had no reliable grounding. A Compliance function that built a policy brief, board paper, or regulator submission around the AI's figure would be documenting a factual misrepresentation of the current state of ISO 20022 deployment in the RTGS segment.

How AI gets this regulation wrong

The failure mode surfaced on this regulation is confident fabrication: AI tools produced a specific, citable-looking statistic with no hesitation, then retreated from it when challenged. The table below maps how this pattern manifests, where the AI's answer was internally plausible but materially wrong, and where the error would survive a quick read without triggering a red flag.

AI's Failure Mode | Count | Affected findings
Exposed Fabrication | 2 | Finding #1 · Finding #2

What that means for your team

For Compliance functions at Statutory Boards & Agencies firms in international jurisdictions, the risk here concentrates in the wrong-deliverable category: the AI's answer was plausible enough to flow into a board paper, regulatory submission, or internal gap analysis before the error was caught. The table below breaks down what that means in practice: which workflow carries the error forward and what it costs the firm to correct it downstream.

Risk Impact | Count | Affected findings
Wrong deliverable | 2 | Finding #1 · Finding #2

When this affects your department

Compliance teams at Statutory Boards & Agencies firms in international jurisdictions consult AI tools on this regulation most heavily when they are assembling briefing materials for a governing board or finance ministry, scoping an internal implementation roadmap against where the global ecosystem sits, or responding to a business line asking whether their payment infrastructure is compliant with the direction of travel CPMI has set. Adoption-rate statistics sit at the heart of all three use cases: they frame the urgency of the firm's own migration, calibrate peer-benchmarking arguments, and anchor the regulatory context in any board paper or external communication.

The specific failure here, overstating RTGS adoption by roughly 30 percentage points, would misrepresent the competitive and regulatory landscape in exactly the documents that carry the most institutional weight. A board paper asserting that close to 80 percent of RTGS systems have adopted ISO 20022 will frame the firm's own timeline as a catch-up exercise when, on the RTGS side, the actual picture is closer to a half-adoption market. That framing affects capital allocation decisions, vendor contract timelines, and the strength of the business case for accelerating internal migration.
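The size of that gap can be checked with simple arithmetic. The sketch below uses round stand-in values (80 for "close to 80 percent" and 50 as an upper bound for "approaching half"); these are illustrative assumptions, not exact CPMI figures, and the function name is hypothetical.

```python
# Illustrative stand-ins only: the text gives "close to 80 percent" and
# "approaching half", not exact values.
ai_claimed_rtgs_pct = 80.0  # assumed figure the AI applied to RTGS systems
source_rtgs_pct = 50.0      # assumed upper bound for "approaching half"


def overstatement(claimed_pct, source_pct):
    """Return (gap in percentage points, gap as a percent of the source figure)."""
    gap_pp = claimed_pct - source_pct
    relative_pct = gap_pp * 100 / source_pct
    return gap_pp, relative_pct


gap_pp, relative_pct = overstatement(ai_claimed_rtgs_pct, source_rtgs_pct)
print(f"Overstatement: {gap_pp:.0f} percentage points "
      f"({relative_pct:.0f}% above the source figure)")
```

Under these assumptions, a 30 percentage point gap means the claimed figure is 60 percent higher than the source figure. Because "approaching half" sits below 50, the true gap would be at least this large.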

For a firm that interfaces with a central bank, a finance ministry, or a supranational body, all normal counterparties for a Statutory Boards & Agencies compliance function, presenting an inflated RTGS adoption figure in a formal submission or regulatory dialogue creates a specific credibility risk. Counterparties working from the same CPMI monitoring data will immediately identify the discrepancy, and the reputational damage of producing a demonstrably wrong statistic in a regulatory context is disproportionate to how easy the error would have been to prevent.

The findings at a glance

The table below summarises the findings tested against this regulation: the question asked, what AI tools answered, and how that diverges from what the CPMI monitoring data actually states.

# | Finding title | Type | Citation ID
1 | ISO 20022 adoption rate conflation: RTGS vs faster payments (Opus 4.7) | Hallucination | RLB-F-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Opus47
2 | ISO 20022 adoption rate conflation: RTGS vs faster payments (Sonnet 4.6) | Hallucination | RLB-F-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Sonnet46

Aggregate impact

With two findings in this cell, both arising from the same question, the pattern is clear and concentrated: the AI failure occurs on a quantitative regulatory datapoint where the source text makes a deliberate and material distinction between two payment system types. The authoritative CPMI monitoring survey, as reported by FSB officials, explicitly separates faster payment system adoption (more than three-quarters) from RTGS adoption (approaching half). AI tools collapsed that distinction into a single figure and applied it uniformly. The resulting answer was internally consistent and looked credible because it cited a real-sounding percentage, but it systematically misrepresented the RTGS segment.

This matters disproportionately for a Compliance function at a Statutory Boards & Agencies firm because RTGS infrastructure is often the firm's direct operational terrain. Regulatory benchmarking for a statutory body typically centres on the RTGS segment, not faster payment retail rails. An inflated RTGS adoption figure sets the wrong baseline for gap analysis, misrepresents the firm's position relative to peers, and generates an inaccurate urgency assessment for internal prioritisation.

The systemic risk is that this class of error, confident fabrication of a specific statistic, is uniquely hard to catch in a Compliance workflow that relies on AI for research efficiency. The answer looks well-sourced, uses the right terminology, and lands within a broadly plausible range. A junior analyst under time pressure has no obvious signal to pause on. The correction only surfaced when the AI was directly challenged, a step that rarely happens when a stat is embedded in a longer draft rather than standing alone as the primary output.

What your team should do

The default position for any Compliance work that cites ISO 20022 adoption statistics should be to go to the primary CPMI monitoring survey output or the FSB Payments Summit speech record directly, not to use AI as the lookup layer. The CPMI monitoring data is publicly available and stable; the cost of pulling the actual figure is low, and the cost of carrying a wrong figure into a board paper or regulatory submission is not. Treat AI answers on quantitative regulatory monitoring data as a starting hypothesis that needs source verification before it reaches any deliverable.

For the specific workflow risk here, the practical safeguard is a house rule on AI-assisted drafting: any sentence that contains a percentage, adoption rate, implementation count, or survey-derived figure must have the primary source cited in the draft, not just attributed to "CPMI monitoring data." If a junior can't produce the URL or document section, the figure doesn't go in. That rule catches this class of fabrication before it reaches review, because the AI's figure, when traced, does not match the source text.
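A minimal sketch of how such a house rule could be pre-screened automatically before human review. The sourcing convention assumed here (a URL, or a bracketed `[source: ...]` tag in the same sentence) and the function names are hypothetical, not part of any RLB or CPMI tooling. The sentence splitter is deliberately naive.

```python
import re

# Hypothetical convention: a sentence counts as "sourced" if it carries a URL
# or a bracketed "[source: ...]" tag. Adapt to your own drafting standard.
FIGURE = re.compile(
    r"\d+(?:\.\d+)?\s*(?:%|per\s?cent|percent|percentage points?)",
    re.IGNORECASE,
)
SOURCE = re.compile(r"https?://\S+|\[source:[^\]]+\]", re.IGNORECASE)


def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def unsourced_figures(draft):
    """Return sentences that contain a numeric percentage but no source marker."""
    return [
        s for s in split_sentences(draft)
        if FIGURE.search(s) and not SOURCE.search(s)
    ]


draft = (
    "About 80 percent of RTGS systems have adopted ISO 20022. "
    "More than 75% of faster payment systems have adopted it "
    "[source: CPMI survey, section 2]. "
    "Migration is under way."
)
for sentence in unsourced_figures(draft):
    print("NEEDS SOURCE:", sentence)
```

This only catches figures written as digits; phrases such as "three-quarters" or "approaching half" would need a wider pattern. A tag being present also does not mean the figure matches the source, so the check narrows the review effort but does not replace tracing the number to the primary document.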

AI tools are genuinely useful on this regulation for tasks that don't depend on getting a specific number right: summarising the structure of the harmonised data requirements, mapping the LEI/IBAN/BIC field obligations against the firm's existing payment system data model, or drafting a gap-analysis framework for comparing current message formats against the target state. Those tasks leverage the AI's ability to synthesise structured regulatory text, where the failure mode is different and more detectable. Keep AI away from any work-product where a single wrong statistic would compromise the entire deliverable.

How RLB Can Help

RegLeg's published Hallucination Research gives Compliance teams at Statutory Boards and Agencies a practical pre-flight check before placing weight on AI-assisted output for regulatory questions. Because the research is openly available, it can be incorporated into existing review workflows without additional licensing or procurement. Teams can consult the relevant failure-mode findings at the point where AI tools are being used to interpret obligations, draft submissions, or assess enforcement exposure, and adjust their reliance accordingly.

Where published research is not granular enough for a specific operating context, RLB offers bespoke regulator deep-dives tailored to the Compliance function's actual workflow. These engagements map the AI-supported tasks that carry the highest hallucination exposure for a Statutory Board or Agency, typically areas such as multi-jurisdictional obligation mapping, condition-of-licence interpretation, and regulatory correspondence drafting, and produce a prioritised picture of where human verification effort should be concentrated.

RLB also conducts confidential reviews of a firm's existing AI-use policy against RegLeg's failure-mode catalogue, identifying gaps and producing a prioritised remediation roadmap that the Compliance team can action within its normal governance cycle.

To support capability building within the team, RLB develops training material and CPD-aligned content that Compliance staff can use internally. This content is designed to be delivered by the team's own leads rather than requiring ongoing external facilitation, and is calibrated to the regulatory environment and AI tools already in use at the firm. The aim is to leave the Compliance function better equipped to make its own informed judgements about AI reliability, not dependent on external sign-off each time a new workflow is introduced.

Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.