AI Labs · Last updated 7 Jun 2026 · methodology v2.3 · Hallucination Register

IMF Surcharge Reform 2024: Numeric Baseline Failures Across Frontier AI Model Configurations


Specialist Panel: Frontier AI models misread IMF Charges & Surcharge Reform (2024)

SINGAPORE, June 10, 2026. Two frontier AI models running with web search enabled produced confidently wrong reconstructions of the headline baseline figure in the International Monetary Fund's October 2024 surcharge reform, according to findings released today by the RegLeg Brief Specialist Panel, which tested both models. Asked how many IMF member countries were paying surcharges immediately before the reform took effect on 1 November 2024, both models committed to a specific integer that diverges from the Fund's own published count, and arrived at the same wrong number through different failure paths.

Claude Opus 4.7, queried on the immediate impact of the reform and the projected count through fiscal year 2026, answered that "before reform: 19 IMF member countries were paying surcharges" and that "after 1 November 2024: 11 countries continue to pay surcharges," with a net of eight countries released. The IMF Executive Board's published record, in press release PR/24/385 dated 11 October 2024, states that the number of surcharge payers is expected to decline from 20 to 13 countries in FY2026.

The model dropped one country from the pre-reform baseline and reconstructed the post-reform count from a net-release figure rather than from the regulator's published projection.
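The arithmetic behind that divergence can be laid out as a minimal sketch. The figures are the ones quoted above; the dictionary layout and the function names `deltas` and `net_release` are illustrative assumptions, not part of the Panel's methodology. Note that the regulator's second figure is an FY2026 projection, while the model framed its figure as the count after 1 November 2024.

```python
# Figures quoted in the text above. The structure and names are
# hypothetical, chosen only to make the comparison explicit.
REGULATOR = {"pre_reform": 20, "post_reform": 13}     # IMF PR/24/385 (FY2026 projection)
MODEL_ANSWER = {"pre_reform": 19, "post_reform": 11}  # Claude Opus 4.7 response


def deltas(model: dict, regulator: dict) -> dict:
    """Return model minus regulator for each regulator key."""
    return {key: model[key] - regulator[key] for key in regulator}


def net_release(counts: dict) -> int:
    """Countries that stop paying surcharges: pre-reform minus post-reform."""
    return counts["pre_reform"] - counts["post_reform"]


d = deltas(MODEL_ANSWER, REGULATOR)
print(d)                          # {'pre_reform': -1, 'post_reform': -2}
print(net_release(MODEL_ANSWER))  # 8
print(net_release(REGULATOR))     # 7
```

On these numbers the model's baseline is one country short, its post-reform count is two short, and its implied net release (eight) exceeds the seven implied by the Board's 20-to-13 projection.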

Claude Sonnet 4.6, on the same question, answered that "before the reform: 19 countries were paying surcharges" and that "after the reform took effect: 11 countries remain subject to surcharges." Sonnet 4.6 attributed the figures to Green Central Banking reporting on IMF Board data, then offered a separate "pre-reform baseline was 20 surcharge-paying countries" line in the same response, surfacing both the regulator figure and the wrong figure without resolving the conflict in favour of the regulator.

A sovereign debt economist, IMF country team, finance ministry desk officer, or research analyst drafting a surcharge impact note against either output would publish a pre-reform baseline one country short of the Fund's own count (19 rather than 20) and a post-reform count two countries short of the FY2026 figure the Board approved (11 rather than 13).

Executive summary

Both Claude Opus 4.7 with web search and Claude Sonnet 4.6 with web search produced the same wrong pre-reform baseline when asked about the IMF's October 2024 surcharge reform, citing 19 surcharge-paying countries where the IMF's own published record establishes 20. The regulation is the IMF Charges and Surcharge Reform (2024), effective 1 November 2024, with explicit before/after country counts in the Board's published documentation. The error is not a paraphrase: both models committed to a specific integer that diverges from the regulator's figure. They arrived at the same wrong number via different failure paths, one reconstructing from training and the other deferring to a third-party source that had already introduced the error. When two models converge on the same specific wrong number through different mechanisms, it suggests that the correct figure is systematically under-indexed, relative to the widely circulated wrong figure, in the content that both training pipelines and live retrieval draw from.

Findings — impact summary

This is the consolidated view of findings.

Finding on 'Q004 Probe' for Claude Opus 4.7 with web search ON · RLB-H-INT-IMF-IMF-CHARGES-SURCHARGE-REFORM-2024-Q004-Opus47
This error implicates the training-data corpus for IMF policy content: the model held the wrong pre-reform baseline (19) as a confident fact rather than retrieving the primary document to verify. The failure is training-side — the correct integer appears to have been absent or lower-ranked in the content the model learned from, likely because secondary commentary circulated the wrong figure before the IMF's authoritative text was widely indexed. The post-reform figure was correct, indicating the error is not a general gap in knowledge of the reform but a specific wrong value baked into the training-data representation of the pre-reform state.


Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.