Sonnet opens up the hallucination geography encoded in the CPMI ISO 20022 harmonisation standard.
— RLB Specialist Panel
SINGAPORE, June 13, 2026. Two frontier artificial-intelligence models generated reconstructions of the CPMI Harmonised ISO 20022 Data Requirements for Enhancing Cross-Border Payments that blend disaggregated adoption statistics, over-specify a published technical schema, and misattribute institutional roles the regulator's record places elsewhere, according to a white paper released today by RegLeg Brief, a regulatory-research outfit operated by Singapore-incorporated Verdus Technologies Pte. Ltd.
The findings concern the CPMI Harmonised ISO 20022 Data Requirements for Enhancing Cross-Border Payments, Updated Report, which operationalises the messaging architecture for the G20 cross-border payments roadmap and binds correspondent banks, payment scheme operators, real-time gross settlement systems, and implementing bodies such as Fedwire to a common data model. Both Anthropic's Claude Opus 4.7 and Claude Sonnet 4.6 were tested with web search active, mirroring how correspondent-bank compliance teams, payment-scheme operators, fintech integrators, and regtech tools actually use the models when scoping implementation timelines, drafting integration specifications, or briefing senior management on the cross-border payments roadmap.
The regulator's record on ISO 20022 adoption is published not in a structured table inside the primary report but in an official speech delivered by Bank of England Governor Andrew Bailey on 12 March 2026. The verbatim passage reads:
"More than three quarters of faster payment systems covered by the survey, and approaching half of RTGS systems, are now using ISO 20022."
The structural register matters. The regulator characterises the two system types with distinct figures (faster payment systems at "more than three quarters", RTGS systems at "approaching half") because the adoption curve has moved materially faster on faster payment infrastructure than on legacy RTGS rails. A correspondent bank scoping implementation readiness against a single blended figure would budget against an inflated RTGS baseline; a payment-scheme operator briefing a board on adoption momentum would understate the faster-payments lead.
The regulator's record on the operational baseline that justifies harmonisation is in the same speech channel:
Asked what share of faster payment systems and RTGS systems currently use ISO 20022 messaging, Claude Opus 4.7 (with web search on) wrote, verbatim:
"approximately 79% of both real-time gross settlement (RTGS) systems and fast payment systems (FPS) had either already implemented ISO 20022 or had concrete plans to do so"
The structural error. The model collapsed two distinct figures into a single symmetric percentage applied to both system types. The 79% figure matches neither the faster-payment record ("more than three quarters") nor the RTGS record ("approaching half"); it is an internally-reconstructed composite. A compliance professional reading the output would record the two system types as adopting at the same rate and brief senior management against an inflated RTGS baseline.
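The conflation pattern described above can be expressed as a simple check. The following is a minimal sketch, not the Panel's published methodology: the function name, the dictionary keys, and the idea of comparing reference descriptors against reported values are all illustrative assumptions. It flags an answer when the reference record distinguishes the categories but the model reports one shared value for all of them.

```python
def is_conflated(reference: dict, reported: dict) -> bool:
    """Return True when the reference record gives distinct descriptors per
    category but the reported answer assigns every category the same value.

    Hypothetical helper for illustration; keys and structure are assumptions.
    """
    # The reference distinguishes the categories if its descriptors differ.
    distinct_in_reference = len(set(reference.values())) > 1
    # The report has collapsed them if several categories share one value.
    collapsed_in_report = len(reported) > 1 and len(set(reported.values())) == 1
    return distinct_in_reference and collapsed_in_report


# Descriptors quoted from the speech passage above.
reference_record = {
    "faster_payment_systems": "more than three quarters",
    "rtgs_systems": "approaching half",
}

# The single figure the model applied to both system types.
model_answer = {
    "faster_payment_systems": 79.0,
    "rtgs_systems": 79.0,
}

flagged = is_conflated(reference_record, model_answer)
```

The check deliberately avoids turning the regulator's qualitative descriptors into numeric thresholds; it tests only whether a distinction present in the record has disappeared in the output.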
The schema over-specification. Asked when Fedwire implemented the CPMI harmonised data model and what postal address format Fedwire requires for the hybrid/end-state approach, Opus 4.7 wrote that "structured Town Name and Country code are mandatory, with optional structured elements (e.g., Street Name, Building Number, Post Code, Country Sub-Division) plus a limited Address Line element for residual unstructured content." The FRB Services FAQ prescribes a leaner mandatory tier: country code plus town name plus optional free-format lines of 70 characters each.
The model elevated Building Number, Post Code, and Country Sub-Division into a more rigid structured-element category the implementing body's own FAQ does not require.
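To make the mandatory/optional boundary concrete, the sketch below encodes only the tier as this release describes it: a mandatory country code and town name, plus optional free-format lines of at most 70 characters each. The class and field names are hypothetical and are not taken from the FRB Services FAQ or from the ISO 20022 schema; the number of permitted free-format lines is not stated here, so the sketch does not enforce one.

```python
from dataclasses import dataclass, field
from typing import List

MAX_LINE_CHARS = 70  # per-line limit as described in the text above


@dataclass
class HybridAddress:
    """Hypothetical container; field names are illustrative assumptions."""
    country_code: str
    town_name: str
    address_lines: List[str] = field(default_factory=list)


def validate(addr: HybridAddress) -> List[str]:
    """Return a list of problems; an empty list means the address passes."""
    problems = []
    if not addr.country_code.strip():
        problems.append("country code is mandatory")
    if not addr.town_name.strip():
        problems.append("town name is mandatory")
    # Free-format lines are optional, but each one is length-limited.
    for i, line in enumerate(addr.address_lines, start=1):
        if len(line) > MAX_LINE_CHARS:
            problems.append(f"address line {i} exceeds {MAX_LINE_CHARS} characters")
    return problems


ok = validate(HybridAddress("US", "New York", ["1 Example Plaza"]))
bad = validate(HybridAddress("", "New York", ["x" * 71]))
```

Nothing in this sketch requires Building Number, Post Code, or Country Sub-Division, which is the point of the comparison: those elements sit outside the mandatory tier the release attributes to the implementing body.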
Why it failed. The RegLeg Brief Specialist Panel writes that "subcategory-numeric conflation surfaces precisely where the regulator documents distinct figures through a channel, in this case a BIS-hosted official speech, that the model's retrieval ranker indexes at lower density than the primary publication. The 79% composite is the shape an averaging pass over the speech text would produce if the two figures were collapsed before extraction.
The Fedwire over-specification has the same shape on the schema side: where the implementing-body FAQ is not retrieved with sufficient weight, the model reconstructs the mandatory/optional boundary from training, and reconstruction skews toward greater structure than the regulator requires."
The failure modes are classified as inference_drift against substrate documents R6-REPORT_CHAPTER-iso20022-data-requirements-updated.pdf and the FRB Services Fedwire FAQ.
Given the same adoption-rate brief, Sonnet 4.6 (with web search on) wrote:
"79% of both real-time gross settlement (RTGS) systems and fast payment systems (FPS) either have already implemented ISO 20022 or have concrete plans to do so"
The model also produced a fabricated third-party citation pointing to a centralbanking.com URL that does not contain the figure attributed to it.
The attribution error. Asked which central bank chairs the CPMI working group responsible for ISO 20022 harmonisation, Sonnet 4.6 named the Federal Reserve Bank of New York. The regulator's record places the co-chair role with the Reserve Bank of Australia, whose Governor served as Co-Chair. The model offered an explicit "based on available public sources" hedge alongside the answer, but the hedge did not prevent a confidently wrong attribution to a major-market institution that appears more frequently in the source material. A payment-scheme operator pursuing standards-governance engagement on the basis of the output would approach the wrong central bank.
The false-negative evasion. Asked what official statistics exist on inquiry rates and manual touchpoints in cross-border payments, Sonnet 4.6 returned a false negative, claiming no specific official statistic was available. The 12 March 2026 Bailey speech gives the figures explicitly: 1 to 3 per cent of payments generate inquiries, requiring 5 to 10 manual touchpoints, with resolution times reducible by up to 80 per cent through harmonisation. The model retrieved nothing from the speech.
The failure modes are classified as misattributed and inference_drift against the same substrate.
The CPMI ISO 20022 findings sit inside a broader failure class that the RegLeg Brief Specialist Panel has been documenting across payments-infrastructure and standard-setter work, which it calls Numeric Conflation and Channel-Index Drift: frontier models systematically collapse disaggregated subcategory statistics into single composite figures, and fall back on internally reconstructed content when the regulator's record is delivered through a speech, working-group press release, or implementing-body FAQ rather than the primary publication.
The white paper documents the pattern across the audited question set:
A correspondent bank, payment-scheme operator, fintech integrator, or regtech tool automating implementation-readiness scoping or counterparty-engagement planning on either model would carry the blended adoption rate, the wrong central-bank attribution, the over-specified Fedwire schema, and a missing operational baseline into the artefacts the firm produces.
Both Claude outputs shared the same surface characteristics: structured percentages, regulator-attributed phrasing, internal coherence, and, in Sonnet 4.6's case, an explicit hedge that read as a calibration signal. The white paper states the operational risk plainly:
"The failure is not recoverable by the user in real-time: the model's output reads as a faithful summary of the regulator's monitoring data, and validation would only happen if the reader already knew that the regulator's faster-payment and RTGS adoption rates diverge by roughly thirty percentage points and that the 79% composite matches neither."
Correspondent-bank compliance teams, payment-scheme operators, fintech integrators, and regtech tools advising on cross-border implementation timelines are the population most exposed. They use AI assistants to summarise CPMI publications, draft integration specifications, and structure board-paper briefings against tight roadmap deadlines, the exact workflow in which the failure surfaces.
The RegLeg Brief Specialist Panel documents five red-team probe designs in the white paper that any AI lab or alignment team can run against their own models with no commercial engagement required:
RegLeg Brief operates as a completely ungated, open-access public resource. The white papers, per-finding cards, regulator verbatim excerpts, RLB Citation IDs, methodology notes and supporting data logs are all published without paywalls, registration walls, or data-licensing fees. By documenting original regulatory research without financial or distribution barriers, the platform ensures that:
Because RegLeg Brief conducts its own original research and adversarial analysis against frontier AI models, the detail in each published finding is precise enough to enable AI labs to take targeted hallucination-mitigation measures. Directions an AI lab might consider, drawing on the published findings, include:
AI labs and model developers named in any published finding have an unconditional right of reply; the Specialist Panel will publish any factual correction or contextual response alongside the original finding, with no editorial gatekeeping. Researchers, regulators, and compliance teams with questions on methodology or specific findings can reach the Specialist Panel via the same channel.
These findings and associated work have been published with a view to the greater good: the development of a safer AI ecosystem. Any party reading this or any finding on reglegbrief.com may contact us and has the same unconditional right of reply.
RegLeg Brief is operated by Verdus Technologies Pte. Ltd. (UEN 201616982R), incorporated in Singapore. The RLB Specialist Panel, with an aggregate of over 60 years of public-policy and industry experience, documents only confirmed hallucination findings, under a methodology that requires a verbatim regulator excerpt for every documented claim. All findings, citation IDs, model outputs, regulator excerpts, and methodology notes are open-access.
Primary source verified: CPMI Harmonised ISO 20022 Data Requirements for Cross-Border Payments (2026 Update) · Substrate documents: p_09_OTHER_Governance___which_institution_chaired_t_brief10.htm, p_10_SPEECH_ISO_20022_adoption_rates_across_payment_r260316d.htm, p_11_SPEECH_Cross_border_payment_inquiry_rates_and_e_r260316f.htm, p_17_NOTICE_Fedwire_Funds_Service_implementation_dat_mandating-iso-20022-enhanced-data-in-chaps.html · CPMI portal: bis.org/cpmi
Citation IDs referenced:
RLB-H-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q004-Sonnet46
RLB-H-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Opus47
RLB-H-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Sonnet46
RLB-H-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q007-Sonnet46
RLB-H-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q010-Opus47