Promoting the Harmonisation of Application Programming Interfaces to Enhance Cross-Border Payments: Recommendations and Toolkit
Corporate Banking × Operations — International / Multilateral · updated 2026-06-04 · methodology v2.3

AI on CPMI API Harmonisation Cross-Border Payments for Operations teams at Corporate Banking firms in international jurisdictions

Executive Summary

Operations teams at corporate banking firms running cross-border payment infrastructure programmes are increasingly leaning on AI tools to accelerate their understanding of CPMI's API harmonisation framework — scoping implementation gaps, mapping toolkit criteria, and tracking successive document revisions. Across the two questions tested against this regulation, AI tools produced hallucinations in both cases, with zero reliable answers.

The failures split into two distinct shapes: one where the AI confidently fabricated a complete self-assessment toolkit structure that has no basis in any public source, and one where the AI misstated a document's publication date and invented specific technical annex contents, self-retracting only when pressed to verify sources. Both errors fall in the category of wrong deliverable — meaning work product built on either answer would be materially incorrect from the first line.

How AI gets this regulation wrong

The failures on this regulation split cleanly between two modes: confident fabrication of content that doesn't exist in any accessible source, and invented factual specifics — dates, technical structures, data entity breakdowns — that fall apart when the AI is pushed to cite its sources. Both modes are operationally dangerous precisely because the outputs look like informed answers: structured, detailed, and delivered without hedging until challenged.

AI's Failure Mode | Count | Affected findings
Exposed Fabrication | 1 | Finding #1
Misstated Rule | 1 | Finding #2

What that means for your team

Every error identified on this regulation puts the team at risk of producing the wrong deliverable — whether that is an internal gap assessment built against fabricated criteria, a technical specification mapped to non-existent data entity structures, or an implementation timeline anchored to an incorrect document publication date. The practical risk is not ambiguity at the margins: it is a core work product that is incorrect from the ground up and that downstream teams — technology, compliance, correspondent banking counterparties — will act on.

Risk Impact | Count | Affected findings
Wrong deliverable | 2 | Finding #1 · Finding #2

When this affects your department

Operations teams reach for AI tools on this regulation in several recognisable moments: scoping a readiness assessment ahead of a programme board presentation, briefing the technology architecture team on what the self-assessment toolkit actually requires, or doing a rapid-turn horizon scan when the ISO 20022 data requirements are updated and the team needs to understand what changed since the prior version.

The CPMI API harmonisation framework is precisely the kind of multi-document, iteratively-revised corpus where AI appears most useful — it spans a main report, a toolkit companion, ISO 20022 data requirement publications, and subsequent updates, making manual source-chasing genuinely time-consuming.

The exposure surfaces the moment a team member uses an AI-generated summary of the self-assessment toolkit as the basis for an internal gap analysis. If the four-area structure and assessment dimensions the AI produces are fabricated — as our testing found — every gap identified, every control mapped, and every remediation action prioritised in that analysis is working from a false schema. That deliverable may travel upward into steering committee papers, outward into programme milestone reporting, or laterally into vendor RFPs specifying which API capability areas need to be assessed. Each downstream handoff compounds the error.

The version-tracking risk is equally sharp. When operations teams track successive CPMI document releases to keep implementation specifications current, an incorrect publication date for the updated ISO 20022 data requirements document is not just a filing error — it creates a version-control gap in any system build or data mapping exercise that references the document by date. If the technical annex contents are also fabricated, the data entities and structures the team designs against will diverge from the actual standard.
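One way to reduce the version-control gap described above is to identify each document release by a fingerprint of the file the team actually downloaded, and to treat the publication date as unconfirmed until someone reads it off the primary publication page. The following is a minimal sketch under that assumption. The `DocumentReference` record, its field names, and the helper functions are hypothetical and are not part of any CPMI or BIS specification.

```python
import hashlib
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DocumentReference:
    """Pinned reference to one release of a source document (hypothetical schema)."""
    title: str
    sha256: str                               # fingerprint of the downloaded file
    publication_date: Optional[date] = None   # unknown until read from the primary page
    date_verified: bool = False               # True only after a manual check


def pin_document(title: str, content: bytes) -> DocumentReference:
    """Identify a release by content hash; the date stays unverified."""
    return DocumentReference(title=title, sha256=hashlib.sha256(content).hexdigest())


def confirm_date(ref: DocumentReference, published: date) -> DocumentReference:
    """Record a publication date that a person confirmed against the primary page."""
    return replace(ref, publication_date=published, date_verified=True)


def same_release(ref: DocumentReference, content: bytes) -> bool:
    """Check whether a file is byte-for-byte the release that was pinned."""
    return hashlib.sha256(content).hexdigest() == ref.sha256
```

Keying the reference on the hash rather than the date means a specification can still point at the right release even if a date was recorded incorrectly, and a mismatched hash signals that the team is reading a different version.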

In correspondent banking programmes where counterparty API alignment depends on shared adherence to CPMI's harmonised data model, that divergence has direct operational consequences for message formatting, exception handling, and reconciliation workflows.

The findings at a glance

The two findings below cover the CPMI self-assessment toolkit structure and the updated ISO 20022 data requirements document — the two most operationally consequential questions an operations team would put to AI when scoping implementation work under this framework.

# | Finding title | Type | Citation ID
1 | Fabricated self-assessment toolkit structure | Hallucination | RLB-F-INT-BIS-CPMI-API-HARMONISATION-CROSS-BORDER-2024-Q005
2 | Wrong publication date and invented technical annex content | Hallucination | RLB-F-INT-BIS-CPMI-API-HARMONISATION-CROSS-BORDER-2024-Q009

Aggregate impact

Both failures on this regulation share the same root condition: the underlying source documents are either not publicly extractable or have been published too recently for AI training data to have absorbed them accurately. The self-assessment toolkit is described on the CPMI landing page as a companion to the main report, but its internal structure — the assessment areas, criteria, and usage process — is not described in any publicly accessible source. AI tools filling that gap do not acknowledge the absence; they construct plausible-sounding detail and present it as confirmed.

The updated ISO 20022 data requirements document was published in February 2026. AI tools responded with incorrect publication dates and fabricated technical annex structures and, in at least one case, relied on a third-party aggregator article rather than the primary BIS source.

The pattern matters for operations teams because both are the type of question a team lead would assign to a junior analyst with the expectation of a reliable, structured answer. The toolkit question in particular (what does it assess, how is it structured, what do we need to do to use it) is precisely the kind of foundational scoping task where a team would not expect to need deep verification. The AI's confident, structured output reinforces that expectation.

The analyst delivers a gap analysis framework; the team lead uses it to scope the programme; the error propagates before anyone checks it against primary source material.

The systemic exposure for corporate banking operations teams is that this regulation sits at the intersection of technology implementation, ISO standards compliance, and correspondent banking coordination — three functions that will each act on the same source material independently. A fabricated toolkit structure does not stay in one team's gap analysis; it migrates into technology architecture assessments, vendor evaluation criteria, and bilateral API harmonisation discussions with counterparties. Unwinding that kind of distributed error is far more costly than the initial analysis that introduced it.

What your team should do

The default position for operations teams using AI on this regulation should be: treat any AI-generated description of the self-assessment toolkit's internal structure as unverified until you have read the toolkit document directly. The CPMI landing page confirms the toolkit exists; it does not describe what it contains. Any AI response that provides a structured breakdown of assessment areas, dimensions, or usage steps is filling a gap that the public record does not fill. Do not build a gap analysis, a readiness matrix, or a board briefing from that answer without first obtaining and reading the source document.
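The default position above can be expressed as a simple gate: every statement about the toolkit carries a provenance label, and nothing feeds a deliverable until each statement has been confirmed against the source document. The sketch below is illustrative only. The `Provenance` labels, the `Claim` record, and the function names are assumptions made for this example, not an established control.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Provenance(Enum):
    AI_GENERATED = "ai_generated"          # produced by an AI tool, not yet checked
    PRIMARY_VERIFIED = "primary_verified"  # confirmed by reading the source document


@dataclass
class Claim:
    text: str
    provenance: Provenance
    source_location: str = ""  # where in the primary document it was confirmed


def unverified_claims(claims: List[Claim]) -> List[Claim]:
    """Return claims that are AI-generated or lack a location in the source."""
    return [
        c for c in claims
        if c.provenance is not Provenance.PRIMARY_VERIFIED or not c.source_location
    ]


def ready_for_deliverable(claims: List[Claim]) -> bool:
    """A deliverable may proceed only if it has claims and all are verified."""
    return bool(claims) and not unverified_claims(claims)
```

Requiring a source location as well as a label is a deliberate choice: it forces the reviewer to point at the passage they read, rather than simply ticking a box.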

For the ISO 20022 data requirements and subsequent updates, the practical safeguard is to anchor all version references to the primary BIS publication page rather than to AI-generated summaries or third-party aggregator articles. When the team needs to understand what changed between a prior and updated version of a CPMI document, direct comparison of the two primary documents is the only reliable method. AI tools have demonstrated a willingness to attribute technical specifics — data entity structures, annex contents, publication dates — to fabricated or third-party sources, and to self-retract only when challenged.

That self-retraction is useful if a team member pushes back, but it is not a safeguard if the first answer goes straight into a technical specification.
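The direct comparison of a prior and an updated document recommended above can be sketched with Python's standard `difflib` module. This assumes the team has already extracted plain text from both primary documents, which is a separate step not shown here. The function name `compare_versions` and the labels are illustrative.

```python
import difflib
from typing import List


def compare_versions(
    prior_text: str,
    updated_text: str,
    prior_label: str = "prior",
    updated_label: str = "updated",
) -> List[str]:
    """Return a unified diff of two document texts, line by line."""
    return list(
        difflib.unified_diff(
            prior_text.splitlines(),
            updated_text.splitlines(),
            fromfile=prior_label,
            tofile=updated_label,
            lineterm="",  # inputs have no trailing newlines, so add none
            n=1,          # one line of context around each change
        )
    )


if __name__ == "__main__":
    # Placeholder text only; real input would be extracted from the primary documents.
    for line in compare_versions("alpha\nbeta\ngamma", "alpha\nBETA\ngamma"):
        print(line)
```

A line-level diff only shows what changed in the text. Deciding whether a change matters for message formatting or data mapping still requires someone to read the affected passages.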

Where AI tools are safe on this regulation: summarising the high-level structure and objectives of the main report, which is a publicly accessible and well-indexed document; drafting internal communications that explain the framework's purpose to business line stakeholders; and generating the skeleton of a gap analysis template — provided the assessment criteria are then populated from primary source documents rather than from AI-generated descriptions. The toolkit and the technical annexes to successive ISO 20022 updates are the two areas to handle with direct source verification, not AI delegation.
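The gap analysis skeleton mentioned above can be kept deliberately empty of criteria, so that the structure is generated up front but the content can only come from the primary documents. The sketch below is one possible shape. The `Criterion` and `GapAnalysisTemplate` names and their fields are hypothetical and do not reflect the actual structure of the CPMI toolkit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    """One row of the template; the first two fields come from the source only."""
    criterion_text: Optional[str] = None   # copied from the primary document
    source_location: Optional[str] = None  # where in the document it appears
    current_state: str = ""                # filled in by the team
    gap: str = ""                          # filled in by the team

    def is_populated(self) -> bool:
        return bool(self.criterion_text and self.source_location)


@dataclass
class GapAnalysisTemplate:
    framework: str
    rows: List[Criterion] = field(default_factory=list)

    def add_blank_rows(self, count: int) -> None:
        """Add empty rows; no criteria are invented at this stage."""
        self.rows.extend(Criterion() for _ in range(count))

    def unpopulated_rows(self) -> List[int]:
        """Indices of rows still missing a sourced criterion."""
        return [i for i, row in enumerate(self.rows) if not row.is_populated()]

    def is_complete(self) -> bool:
        return bool(self.rows) and not self.unpopulated_rows()
```

The template reports itself as incomplete until every row has both a criterion and a location in the source, which keeps AI-generated descriptions out of the assessment criteria by construction.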

How RLB Can Help

RegLeg's published Hallucination Research gives Operations teams a pre-flight check before they commit to any AI-assisted output on regulatory questions — correspondent banking eligibility, settlement cut-off interpretation, capital threshold calculations, cross-border payment screening rules. The research is regulation-specific and publicly available: if your team is leaning on AI tools to navigate a particular instrument, check whether that regulation is already in the corpus. A documented failure mode on the instrument you're relying on is material to whether you run with the output or escalate for manual review.

Beyond the public findings, we run bespoke regulator deep-dives scoped to the workflows where Corporate Banking Operations teams carry the highest hallucination exposure. That means mapping your specific AI touchpoints — transaction monitoring parameterisation, nostro reconciliation guidance, SWIFT compliance checks, correspondent due diligence — against the failure patterns we've catalogued for the jurisdictions and regulatory bodies relevant to your book.

The output is a ranked exposure register your team can use directly: which use cases are defensible, which carry unquantified model risk, and where the gap between what the AI asserts and what the regulation actually says is wide enough to create a control failure.

We also work confidentially with Operations and Compliance leads to review existing AI-use policies against our failure-mode catalogue, producing a prioritised remediation list with the specificity your senior management and audit functions will expect — not a generic AI-governance framework, but a line-by-line assessment of where your current policy under-addresses known failure patterns in your regulatory universe. For teams building or refreshing internal training, we can provide CPD-aligned material that translates the research findings into practical calibration guidance for the staff who are closest to the AI tools day-to-day.