AI Hallucination on Harmonised ISO 20022 Data Requirements for Enhancing Cross-Border Payments - Updated Report for Product & Business Development teams at Retail Banking firms in international jurisdictions

Executive Summary

Product & Business Development teams at Retail Banking firms operating across international payment corridors are directly exposed to CPMI's Harmonised ISO 20022 Data Requirements for Enhancing Cross-Border Payments, the standard that defines how enriched data fields must be structured in cross-border payment messages and sets the adoption benchmarks that regulators and correspondent banking partners increasingly use to assess readiness.

In testing AI tools against this regulation, we found that AI assistants produced materially incorrect figures on a question where the correct answer is a policy-consequential metric: the relative adoption rates of ISO 20022 across faster payment systems (FPS) and RTGS systems globally. The failure was not a minor rounding error. The AI collapsed two distinct adoption statistics into a single conflated figure, systematically misrepresenting the gap between FPS and RTGS adoption in a way that would distort any internal assessment, board update, or product roadmap built on that answer.

One aggregated finding emerged, classified as a hallucination: the AI delivered a confident, specific answer that was demonstrably wrong and then, when pressed, acknowledged that the figure had been reconstructed.

How AI gets this regulation wrong

The failure pattern on this regulation is a specific form of confident fabrication: AI tools present a single, authoritative-sounding percentage where the source actually reports two materially different rates for two different system types. When challenged, the AI concedes the figure was assembled from memory rather than drawn from a specific authoritative source, but the original confident delivery means teams rarely push back far enough to trigger that admission.

AI's Failure Mode	Count	Affected findings
Exposed Fabrication	2	Finding#1 · Finding#2

What that means for your team

For a Product & Business Development team at a Retail Banking firm, this failure lands squarely in the category of wrong deliverable: any strategy paper, board briefing, regulatory submission, or correspondent banking due-diligence pack that incorporates the AI's conflated adoption figure carries forward a factually inaccurate market baseline. The downstream consequences range from mispriced product timelines to regulatory credibility damage when the firm's stated understanding of the global ISO 20022 migration landscape contradicts what CPMI's own monitoring data shows.

Risk Impact	Count	Affected findings
Wrong deliverable	2	Finding#1 · Finding#2

When this affects your department

Product & Business Development teams reach for AI tools on CPMI's ISO 20022 harmonisation work in a predictable set of contexts: drafting the payments-infrastructure section of a new cross-border product feasibility study; building regulatory-context slides for a board or ALCO update on the firm's ISO 20022 migration readiness; scoping the correspondent banking due-diligence framework for a new payments corridor; or responding to a business line query about how far along the market is and whether the firm's timeline is competitive.

In all of these workflows, the CPMI adoption statistics are load-bearing: they anchor the "where is the market right now" baseline from which the firm's own positioning is derived.

The specific failure documented here, collapsing the faster payment system adoption rate and the RTGS adoption rate into a single blended figure, introduces a directional error that is not self-correcting. Faster payment systems are materially further along the ISO 20022 migration curve than RTGS systems; if your product team believes both are at roughly the same level, you will systematically underestimate the urgency of RTGS-facing infrastructure decisions and simultaneously overstate the firm's relative lag (or lead) in the FPS space.

A product roadmap built on that misread will be out of step with where correspondent banking partners and payment infrastructure providers actually sit.

The regulatory credibility risk is equally concrete. If the firm's regulatory mapping documentation, product approval submission, or response to a supervisory request cites the AI's conflated figure, the discrepancy with CPMI's published monitoring data is verifiable by any reviewer who reads the source. That is the kind of error that signals to regulators and auditors that the firm's AI-assisted compliance work is not being sense-checked against primary sources, with implications that extend well beyond this single figure.

The findings at a glance

One question, one failure pattern, recorded for each of the two models listed below, but the finding cuts to the core of how Product & Business Development teams benchmark the ISO 20022 migration landscape when building product strategy and regulatory submissions.

#	Finding title	Type	Citation ID
1	ISO 20022 adoption rate conflation: RTGS vs faster payments (Opus 4.7)	Hallucination	RLB-F-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Opus47
2	ISO 20022 adoption rate conflation: RTGS vs faster payments (Sonnet 4.6)	Hallucination	RLB-F-INT-BIS-CPMI-ISO-20022-HARMONISATION-UPDATED-2026-Q006-Sonnet46

Aggregate impact

With one finding in this cell, the failure pattern is concentrated rather than diffuse, but it is concentrated on precisely the metric that Product & Business Development teams are most likely to use as a briefing anchor.

The CPMI's ISO 20022 harmonisation work is notable for having a published, speech-level quantification of adoption progress: "more than three-quarters of faster payment systems and approaching half of RTGS systems now use ISO 20022." These are two distinct numbers with a material gap between them, and that gap carries strategic content: it tells the industry that FPS migration is largely done while RTGS is still in mid-cycle. AI tools tested on this regulation collapsed that distinction into a single figure and applied it uniformly to both system types.

For a Retail Banking firm's Product & Business Development function operating across international corridors, the asymmetry between FPS and RTGS adoption is not an abstract data point. It shapes decisions about where to accelerate internal investment, which correspondent relationships need a readiness conversation, and how to characterise the firm's competitive positioning in product materials or investor communications. A team that trusts the AI's blended figure will be working from a model of the market that simply does not exist: one in which both RTGS and faster payment infrastructure are at roughly equivalent stages of ISO 20022 readiness.

That misread propagates into strategy documents, capability assessments, and product approval packs in ways that are difficult to catch once the numbers have been cited.

The systemic risk is compounded by the AI's behaviour under challenge: rather than maintaining the error, it admitted, when pressed, that the specific figure had been reconstructed rather than sourced from a primary publication. That is a clear signal that the confident initial delivery was not backed by verified knowledge. In a fast-moving product development cycle, junior team members are unlikely to probe hard enough to surface that admission; the figure passes unchallenged into the first draft and typically survives review.

What your team should do

The default position on any CPMI ISO 20022 adoption metric should be primary-source verification before the figure enters a draft. CPMI documents the monitoring data directly; the adoption percentages for faster payment systems and RTGS systems are available in speeches and reports from the BIS, and they are specific, stable, and attributable. If your team is using AI to pull these numbers for a product strategy document or regulatory filing, treat the output as a starting hypothesis rather than a citable fact, and build the verification step into the workflow before review, not as a retrospective check.

The practical safeguard for this regulation is straightforward: create a short standing reference for the current CPMI adoption figures (FPS rate and RTGS rate separately, sourced and dated) that the team can use as a sanity-check against any AI-generated briefing material. Given how frequently these figures appear in cross-border payments strategy work, the cost of maintaining that reference is minimal relative to the cost of a mis-cited metric surfacing in a board paper or regulatory submission.

Any junior team member who pulls an AI summary touching ISO 20022 adoption should cross-reference the two rates against that reference before the material moves forward.
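The cross-check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the names `ReferenceRate`, `REFERENCE`, and `check_claims` are hypothetical, the bounds encode only the qualitative wording quoted in this report ("more than three-quarters" and "approaching half"), and the `source` and `as_of` fields are placeholders the team would fill in from the primary publication.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ReferenceRate:
    """One entry in the team's standing reference (hypothetical structure)."""
    wording: str                    # wording as quoted in this report
    min_exclusive: Optional[float]  # claim must be strictly above this, if set
    max_exclusive: Optional[float]  # claim must be strictly below this, if set
    source: str                     # placeholder: primary publication to cite
    as_of: str                      # placeholder: date of that publication


# Assumption: bounds are read directly off the quoted wording, nothing more.
REFERENCE: Dict[str, ReferenceRate] = {
    "FPS": ReferenceRate("more than three-quarters", 0.75, None,
                         "TO BE FILLED FROM PRIMARY SOURCE", "TO BE FILLED"),
    "RTGS": ReferenceRate("approaching half", None, 0.50,
                          "TO BE FILLED FROM PRIMARY SOURCE", "TO BE FILLED"),
}


def check_claims(claims: Dict[str, Optional[float]]) -> List[str]:
    """Return a list of issues found in AI-supplied adoption shares (0.0 to 1.0)."""
    issues: List[str] = []
    for system, ref in REFERENCE.items():
        value = claims.get(system)
        if value is None:
            issues.append(f"{system}: no separate rate stated")
            continue
        if ref.min_exclusive is not None and not value > ref.min_exclusive:
            issues.append(f"{system}: {value:.0%} conflicts with '{ref.wording}'")
        if ref.max_exclusive is not None and not value < ref.max_exclusive:
            issues.append(f"{system}: {value:.0%} conflicts with '{ref.wording}'")
    # A single figure reused for both system types is the conflation pattern.
    stated = [v for v in claims.values() if v is not None]
    if len(stated) > 1 and len(set(stated)) == 1:
        issues.append("identical figure for all system types: possible conflation")
    return issues


if __name__ == "__main__":
    # Illustrative inputs only; these are not CPMI figures.
    for issue in check_claims({"FPS": 0.60, "RTGS": 0.60}):
        print(issue)
```

A blended figure applied to both system types fails on every count, because no single value can sit above three-quarters and below one-half at the same time. An empty result means only that the claim is consistent with the quoted wording; it does not replace reading the primary source.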

AI tools are genuinely useful in this regulation's workflow for three tasks: drafting narrative framing around a known, verified baseline (the actual CPMI figures supplied by the team); summarising the scope and structure of the harmonised data requirements where the technical content is not the issue; and helping structure a gap analysis once the quantitative benchmarks have been independently confirmed. The failure documented here is specific to quantitative adoption metrics. It does not mean AI assistance on this regulation is uniformly unreliable, but it does mean the numbers need to be owned by the team, not delegated to the tool.

How RLB Can Help

RegLeg's published Hallucination Research is available as a free pre-flight reference before your team acts on AI-assisted regulatory analysis, such as product eligibility reads, cross-border licensing assessments, and conduct-risk scoping for new features. The findings index which regulatory texts, jurisdictions, and question types have already produced documented AI failures in controlled testing.

If your team is relying on AI output to inform a product approval decision or a market-entry call and the relevant regulation appears in the catalogue, that is the moment to cross-check: not to abandon the workflow, but to know which outputs carry weight and which need a qualified second read before they reach a credit committee or a regulator.

For Product & Business Development specifically, the highest-exposure workflows tend to cluster around the same fault lines the research surfaces repeatedly: eligibility thresholds stated with false precision, exemption conditions that AI tools assert confidently but get wrong in detail, and jurisdiction-specific carve-outs that were amended after the model's training horizon. RegLeg can run a bespoke deep-dive against the regulator set that matters to your firm's product roadmap, mapping where AI assistance is low-risk in your workflows and where the documented failure modes should change how your team stages that reliance.

The output is scoped to your regulatory perimeter, not a generic risk list.

If your firm already has an AI-use policy in place for product development and regulatory research functions, RegLeg can review it against the failure-mode catalogue under confidentiality and return a prioritised remediation view: which policy provisions are substantively protective, which create false confidence, and where the gaps sit relative to the failure patterns the research has documented. We can also develop training material and CPD-aligned content calibrated to Product & Business Development workflows. This is not generic AI-literacy content, but material that treats your team's existing regulatory and commercial expertise as the baseline and builds on it.