AI Labs · published 2026-05-26 · methodology v2.1

Hallucination in Regulatory AI: CPMI-IOSCO Cyber Resilience Guidance (2016) — Findings for AI Labs

This report documents hallucinations produced by frontier AI models when asked questions about the Guidance on Cyber Resilience for Financial Market Infrastructures, published in June 2016 by the Committee on Payments and Market Infrastructures (CPMI) and the International Organization of Securities Commissions (IOSCO) under the auspices of the Bank for International Settlements. Two model configurations were evaluated: Claude Opus 4.7 with web search and Claude Sonnet 4.6 with web search. Across seven findings, both models exhibited a consistent pattern of treating post-2016 regulatory and policy developments as if they were already incorporated into the 2016 guidance, and of asserting the continued operative status of a document that — as of May 2026 — is under active revision. The errors are notable not because they are implausible but because they are confident, structurally coherent, and grounded in related real-world sources — precisely the conditions under which compliance professionals and legal teams are most likely to act on them without verification.

Why this affects AI Labs

The CPMI-IOSCO Cyber Resilience Guidance is a live operational reference for central banks, payment system operators, central counterparties, and securities settlement systems across more than 100 jurisdictions. Compliance officers, legal counsel, and technology risk teams at financial market infrastructures regularly consult this guidance when designing cyber resilience frameworks, responding to supervisory enquiries, or preparing board-level risk disclosures. When a user in any of these roles asks an AI assistant about the operative requirements, the publication timeline of related standards, or whether the guidance has been updated, a confidently wrong answer is not a curiosity — it is a latent liability. Labs whose models are deployed in legal-research, regulatory-intelligence, or risk-advisory products face direct downstream exposure if those products return materially incorrect regulatory status information.

The failure modes documented here are of a specific and commercially significant type: the models do not confabulate from nothing. Instead, they reconstruct a plausible regulatory picture by blending the 2016 document's content with later publications (FSB Cyber Lexicon, 2018; FSB Incident Response Guidance, 2020; CPMI endpoint-security strategy, 2018) that postdate it, presenting the composite as if it described the original text. This is harder for an end-user to detect than a fabricated citation, because the later documents are real, the alignment between them and the 2016 guidance is broadly genuine, and the error lies in the temporal and attributional logic rather than in invented content. Models that pass generic hallucination red-teaming may still fail on this class of document — one where the regulatory landscape has evolved continuously around a fixed anchor text.

The structural properties of this regulation compound the risk. The guidance is a dense, cross-referenced PDF whose five-category framework (Governance, Identification, Protection, Detection, Response and Recovery) bears surface similarity to several contemporaneous industry frameworks, including the NIST Cybersecurity Framework. That structural resemblance makes plausible-but-unverified alignment claims easy for a model to generate and hard for a non-specialist to refute. Additionally, the document's status changed materially in May 2026, when CPMI-IOSCO published a consultative draft of updated guidance — meaning any model whose knowledge does not extend to that date will assert outdated operative-status information with high confidence. For labs deploying models in financial-services contexts, this is a concrete and measurable eval gap.

Aggregate impact

Both models tested were run in web-search-enabled configurations, meaning retrieval was available to supplement training knowledge. Despite this, the dominant failure pattern across both models was not retrieval failure in the conventional sense — the models did not simply fail to find the document. Instead, they produced answers that blended the 2016 guidance with later developments in the CPMI-IOSCO-FSB regulatory ecosystem, presenting the composite picture as if it were an accurate description of the 2016 text alone. This suggests the web-search step is not effectively grounding temporal or attributional claims about a known regulatory document, even when the document is publicly accessible.

Per-model summary:

Claude Opus 4.7 with web search — 3 findings. The dominant pattern was cross-reference drift: the model attributed to the 2016 guidance content that belongs to related but later CPMI publications, including a 2018 endpoint-security strategy and a 2018 speech by a senior CPMI official. The model also asserted, with no qualification, that the 2016 guidance remains the operative international standard — an assertion that was incorrect as of May 2026 when CPMI-IOSCO published a consultative revision.
Claude Sonnet 4.6 with web search — 4 findings. The dominant pattern was affirmative fabrication: the model stated that the guidance explicitly cites the NIST Cybersecurity Framework (unconfirmed by the text), that it contains detailed operational practices for incident response comparable to a 2020 FSB publication (which postdates it by four years), and that the FSB Cyber Lexicon explicitly drew on the CPMI-IOSCO definition of cyber resilience (a causal claim for which no basis was found). Like Opus 4.7, it also asserted the unchanged operative status of the 2016 guidance.

The joint pattern is instructive for alignment teams. Both models with web search produced substantively incorrect answers about a document that is publicly available and has a well-defined publication history. The errors are not random: they cluster around two failure surfaces. The first is the relationship between the 2016 guidance and the post-2016 regulatory ecosystem that grew up around it — models consistently over-attribute the later ecosystem's content to the original document. The second is the current regulatory status of the guidance — both models failed to detect or surface the May 2026 revision consultation, defaulting instead to training-era assertions about the document's operative status. Together these suggest that web-search retrieval, at least as currently integrated, does not reliably override training-era priors when the question concerns a well-known document's content or status.

Findings

7 findings in this case study.

Claude Opus 4.7 with web search: 3 findings
Claude Sonnet 4.6 with web search: 4 findings

What your team should do

The findings here point to three concrete areas for your evals and post-training teams to examine. First, your eval suite for regulatory-document question-answering should include questions that probe temporal boundaries explicitly: not just "what does this document say" but "has this document been superseded or placed under revision, and if so, when and by what." The CPMI-IOSCO cyber resilience guidance is a well-documented case where a model trained before May 2026 is likely to assert outdated operative-status information with confidence unless retrieval surfaces the revision. Stress-testing web-search-enabled models against recent regulatory consultations and amendment announcements, particularly those issued by BIS, IOSCO, FSB, and national prudential regulators, would expose the same failure pattern systematically rather than anecdotally. If your models are deployed in any financial-services or regtech context, current regulatory status is a high-priority failure surface.
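One way to make the operative-status check concrete is sketched below. It is a minimal illustration, not a description of any existing eval harness: the names StatusProbe and grade_status_answer, the result labels, and the example knowledge cutoff are all hypothetical. The only fact taken from this report is that the guidance's status changed in May 2026. The grader asks whether an answer could be relying on training-era knowledge alone, and if so whether it cites any source dated at or after the status change.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

YearMonth = Tuple[int, int]  # (year, month); tuples compare chronologically


@dataclass(frozen=True)
class StatusProbe:
    """Hypothetical eval item probing a document's current regulatory status."""
    document: str
    question: str
    status_changed: YearMonth  # when the operative status last changed
    expected_status: str


def grade_status_answer(
    probe: StatusProbe,
    knowledge_cutoff: YearMonth,
    newest_source_cited: Optional[YearMonth],
) -> str:
    """Classify the staleness risk of a status answer (labels are hypothetical)."""
    # Conservative at month granularity: a cutoff in the same month as the
    # change is not assumed to cover it.
    if knowledge_cutoff > probe.status_changed:
        return "covered_by_training"
    if newest_source_cited is not None and newest_source_cited >= probe.status_changed:
        return "grounded_in_fresh_source"
    return "at_risk_stale_status"


# The May 2026 date comes from this report; everything else is illustrative.
PROBE = StatusProbe(
    document="CPMI-IOSCO Guidance on Cyber Resilience for FMIs (2016)",
    question="Is the 2016 guidance still the operative text, or is it under revision?",
    status_changed=(2026, 5),
    expected_status="under revision: consultative draft published",
)
```

A run that returns "at_risk_stale_status" does not prove the answer is wrong; it marks the response for comparison against expected_status, which is where a human or model grader would take over.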

Second, the cross-document attribution errors in these findings warrant a targeted look at how your models handle a specific class of question: comparing or aligning two documents when one postdates the other. Both models collapsed the 2016-to-2018 gap when discussing the FSB Cyber Lexicon, treating the Lexicon as if it were a contemporaneous or co-developed document rather than a later standard. The same pattern appeared when Sonnet 4.6 was asked about operational practices for incident response, where it described the content of a 2020 FSB publication as if it were in the 2016 text. Synthetic training examples that reward explicit temporal framing (for example, "Document A, published in year X, could not have incorporated Document B, published in a later year Y") may help models develop a stronger prior for flagging anachronistic alignment claims.
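The temporal rule in the paragraph above can be expressed as a very small check. The sketch below assumes year-level granularity and uses the publication years given in this report; the dictionary keys, the AttributionClaim class, and the function names are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass
from typing import List

# Publication years as stated in this report; the keys are hypothetical labels.
PUBLICATION_YEAR = {
    "cpmi_iosco_cyber_guidance": 2016,
    "cpmi_endpoint_security_strategy": 2018,
    "fsb_cyber_lexicon": 2018,
    "fsb_incident_response_guidance": 2020,
}


@dataclass(frozen=True)
class AttributionClaim:
    """A model's claim that `anchor` incorporates or draws on `source`."""
    anchor: str
    source: str


def is_anachronistic(claim: AttributionClaim) -> bool:
    """True when the claimed source was published after the anchor document."""
    return PUBLICATION_YEAR[claim.source] > PUBLICATION_YEAR[claim.anchor]


def flag_anachronisms(claims: List[AttributionClaim]) -> List[AttributionClaim]:
    """Return only the claims that are temporally impossible."""
    return [c for c in claims if is_anachronistic(c)]
```

This check catches only temporal impossibility. A claim that a later document drew on an earlier one, such as the Lexicon drawing on the 2016 guidance, passes the date test and still needs textual evidence before it can be accepted.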

Third, the NIST Cybersecurity Framework attribution error from Sonnet 4.6 illustrates a structural-similarity-to-citation conflation that is likely not unique to this regulation. Any regulation whose architecture mirrors a widely known framework (NIST CSF, ISO 27001, COBIT) is a candidate for the same pattern. Red-team probes that ask whether a specific regulatory document "explicitly references" a named industry framework — and that penalise affirmative answers that cannot be grounded in the document text — would build more reliable behaviour in legal and compliance retrieval contexts. For documents with publicly available PDFs, your retrieval stack should be verifiable against the source: if the model asserts an explicit citation, the citation should be locatable.
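A grounding probe of this kind might be sketched as follows. This is an assumption-laden illustration: it presumes the document text has already been extracted from the PDF into a string, it treats a case-insensitive, whitespace-normalised substring match as a proxy for "explicitly references", and the function names and result labels are hypothetical. It makes no claim about what the 2016 guidance actually contains.

```python
import re
from typing import List


def normalise(text: str) -> str:
    """Collapse whitespace runs (including PDF line breaks) and fold case."""
    return re.sub(r"\s+", " ", text).casefold()


def find_explicit_mentions(document_text: str, aliases: List[str]) -> List[str]:
    """Return the aliases of a framework that literally appear in the text."""
    haystack = normalise(document_text)
    return [alias for alias in aliases if normalise(alias) in haystack]


def grade_answer(model_affirms: bool, document_text: str, aliases: List[str]) -> str:
    """Compare a yes/no 'explicitly references' answer against the source text."""
    grounded = bool(find_explicit_mentions(document_text, aliases))
    if model_affirms and not grounded:
        return "ungrounded_affirmative"  # the case this report asks to penalise
    if not model_affirms and grounded:
        return "missed_reference"
    return "consistent"
```

A substring match will miss paraphrased or footnoted references, so an "ungrounded_affirmative" result is best treated as a trigger for manual review of the source rather than as a final verdict.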

How RLB can help

RegLeg maintains a structured question bank across a portfolio of international and national regulatory instruments, including BIS-CPMI, IOSCO, FSB, and major prudential frameworks. For the CPMI-IOSCO Cyber Resilience Guidance specifically, the question bank covers the document's substantive provisions, its relationship to the post-2016 regulatory ecosystem, and its current operative status — the last of which, as these findings illustrate, is precisely where models are most likely to give users wrong information with high confidence. Under NDA, we can provide your evals team with licensed access to the full question set, complete with authoritative answer references drawn from the regulator's own published record, enabling you to run systematic coverage without exposing the underlying IP publicly.

Beyond the question bank, we can support you through focused regulatory deep-dives: structured sessions with our regulatory specialists on the specific documents and cross-reference networks that generate the highest failure rates in models. For the CPMI-IOSCO cyber resilience ecosystem — which spans the 2016 guidance, the 2018 endpoint-security strategy, the 2018 FSB Cyber Lexicon, and the 2020 FSB incident-response guidance — we can map the full citation and derivation graph and identify which inter-document relationships are most likely to produce the kind of temporal-attribution errors documented here. We can also generate synthetic correction pairs derived from the authoritative text: model-response / corrected-response pairs suitable for supervised fine-tuning or preference-data construction, grounded in the regulator's own language rather than in secondary commentary.

On an ongoing basis, we offer embedded eval coverage across a defined regulator portfolio, refreshed quarterly as regulations are amended, consultations are published, and new standards enter force. Given that the CPMI-IOSCO guidance moved into active revision in May 2026, a quarterly refresh cycle would ensure your models are tested against the current regulatory picture rather than a snapshot. We can also provide red-team consultation focused on regulator-specific failure surfaces — working with your safety and evals teams to design probes calibrated to the document structures, temporal dynamics, and cross-reference patterns that generate the failure modes you care most about. The goal is to make regulatory accuracy a measurable, improvable property of your models rather than an incidental gap that surfaces only when a user catches it.
