
Two frontier AI models running with web search enabled, both tested by the RLB Specialist Panel, produced confidently wrong reconstructions of the CPMI-IOSCO Consultative Report on Updated Guidance and Public Disclosures to Implement Initial Margin Proposals, the May 2026 consultative document (d232) that codifies how central counterparties must disclose initial margin responsiveness, override frameworks, and simulation tools.
The RegLeg Brief Specialist Panel tested both models on the consultation's text and on the January 2025 BCBS-CPMI-IOSCO report it implements, and documents findings in which the models softened mandatory CCP obligations into discretionary "should consider" language, invented a three-category taxonomy the source report does not contain, and fabricated a three-element public disclosure structure for override frameworks the guidance does not enumerate.
Claude Opus 4.7, asked what obligation the final CPMI-IOSCO guidance places on CCPs to provide margin simulation tools, wrote that CCPs "should consider" making the tools available to clearing members and, where feasible, their clients. The 2024 consultative text from which the obligation derives reads: "Margin simulation tools with certain minimum functionality should be made available by CCPs to clearing members and their clients." The model converted a positive obligation into a discretionary consideration, the difference between a CCP being expected to provide a tool and a CCP being expected to think about providing one.
Asked about the structure of the January 2025 BCBS-CPMI-IOSCO report, Opus 4.7 also asserted the report's ten policy proposals fall into "three broad categories" of CCP transparency, governance and clearing-member transparency. The report's published text describes ten proposals aimed at resilience of the centrally cleared market ecosystem; no three-category taxonomy is stated in the source.
Claude Sonnet 4.6 reproduced the same obligation-softening on margin simulation tools and added a fabricated disclosure structure for the override framework. Asked what CCPs must publicly disclose about their override framework, Sonnet 4.6 enumerated three elements: instances or circumstances where overrides may be warranted, the key decision-makers authorised to exercise override discretion, and the permissible types of adjustments. The consultative text says only: "CCPs should publicly disclose relevant information on their override framework." The three-element list is the model's construction, not the regulator's.
A CCP risk officer, clearing-member compliance lead, or supervisor drafting a comment letter, board paper, or implementation plan against either output would understate CCP obligations on simulation tools, structure their override-framework disclosure against a taxonomy that does not exist in the source, and cite a category framework for the underlying policy proposals that the BIS press release does not endorse. That is the failure mode these findings document.
The dominant failure observed in Claude Sonnet 4.6 on the CPMI-IOSCO Consultation on Updated Guidance and Public Disclosures to Implement Initial Margin Proposals is deontic register substitution: the model hardened a recommendation into a requirement, replacing 'should' with 'must' when characterising CCP override-framework disclosure obligations. The model identified the correct regulatory subject matter but resolved the consultation's conditional language in the wrong direction, citing a third-party law-firm summary that had editorially strengthened the modal register relative to the primary text. On consultation documents where the normative weight of each modal verb is operationally significant, this failure class carries direct downstream consequences for compliance planning. The failure points to a gap in how the model's retrieval configuration handles deontic precision when secondary-source paraphrase diverges from the regulator's primary text.
This error implicates two subsystems simultaneously: the retrieval ranker's weighting of third-party secondary sources over the regulator's primary text for normative queries, and the model's calibration on deontic precision for consultation-class documents. The training corpus does not appear to carry a strong signal distinguishing 'should' in a consultation from 'must' in a final rule; the model resolved the ambiguity by defaulting to the stronger obligation register, which is the systematically wrong direction for consultation documents.
The pretextual citation to a law-firm summary rather than the BIS primary text indicates the retrieval pipeline did not prioritise the authoritative source, and the model did not correct for the secondary source's editorial register change.
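The register comparison described above can be illustrated with a minimal sketch. It is not the panel's tooling: the `MODAL_STRENGTH` ranking, the function names, and the two model-side paraphrases are assumptions made for illustration, and only the two primary-text sentences are taken from the quotations in this article. A real check would need far more than keyword matching (negation, scope, passive constructions), so treat this as a rough way to see the 'should' versus 'must' and 'should' versus 'should consider' deltas side by side.

```python
import re

# Illustrative ranking of modal strength. The ordering is an assumption for
# this sketch, not a taxonomy from the consultation or from the panel.
MODAL_STRENGTH = {
    "should consider": 1,  # discretionary consideration
    "should": 2,           # recommendation / expectation
    "must": 3,             # requirement
    "shall": 3,            # requirement
}


def deontic_strength(text):
    """Return the strongest modal register found in text, or 0 if none."""
    remaining = text.lower()
    strongest = 0
    # Check longer phrases first so "should consider" is not read as "should".
    for phrase in sorted(MODAL_STRENGTH, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, remaining):
            strongest = max(strongest, MODAL_STRENGTH[phrase])
            # Mask the matched phrase so shorter modals inside it are ignored.
            remaining = re.sub(pattern, " ", remaining)
    return strongest


def compare_register(primary, paraphrase):
    """Classify how a paraphrase shifts the register of the primary text."""
    source = deontic_strength(primary)
    output = deontic_strength(paraphrase)
    if output > source:
        return "hardened"
    if output < source:
        return "softened"
    return "preserved"


# Primary-text sentences as quoted in this article.
PRIMARY_OVERRIDE = (
    "CCPs should publicly disclose relevant information on their "
    "override framework."
)
PRIMARY_TOOLS = (
    "Margin simulation tools with certain minimum functionality should be "
    "made available by CCPs to clearing members and their clients."
)

# Hypothetical paraphrases, written for this sketch only.
HARDENED = "CCPs must publicly disclose information on their override framework."
SOFTENED = "CCPs should consider making the tools available to clearing members."

print(compare_register(PRIMARY_OVERRIDE, HARDENED))
print(compare_register(PRIMARY_TOOLS, SOFTENED))
```

Matching the longest phrase first and masking it is the design choice that matters here: without it, "should consider" would register as a plain "should" and the softening delta would go undetected.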
Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.