
Two frontier AI models running with web search enabled, both tested by the RegLeg Brief (RLB) Specialist Panel, produced confidently wrong reconstructions of the CFTC's 2024 amendments to Regulation 1.25, the rule governing permissible investments of customer segregated funds by futures commission merchants and derivatives clearing organizations.
The panel tested both models on three points: the size-triggered 50 per cent concentration limit for government money market funds and qualified Treasury ETFs; the dollar-weighted average maturity (DWAM) limit and its carve-out set; and the separate March 31, 2025 compliance date for the Segregation Investment Detail Report (SIDR) and customer risk disclosure statement. The findings document three kinds of failure. The models dropped the asset-size and management-company-size triggers that activate the 50 per cent ceiling. They inverted the DWAM exclusion set by swapping the carved-out asset classes for unrelated ones. And they drifted on the SIDR compliance anchor into a generic "roughly six months to a year after the effective date" formulation that does not match the regulator's published deadline.
Claude Opus 4.7, asked which concentration limits apply to government money market funds and Treasury ETFs under the 2024 amendments and whether any tiered or size-based thresholds exist, wrote that "the final rule did not adopt tiered or FCM-size-based thresholds, the percentage limits apply uniformly regardless of FCM size." The regulator's text in 17 CFR 1.25(b)(3)(ii) is unambiguous that a size-based trigger does exist, keyed to the fund's own assets and to its management company's assets: investments in government money market funds or qualified ETFs whose fund assets are at least $1 billion and whose management company manages at least $25 billion may not exceed 50 per cent of total segregated assets. The model surfaced the FCM-side question correctly while dropping the fund-side and management-company-side triggers that actually govern the 50 per cent ceiling.
On the DWAM question, Opus 4.7 wrote that "U.S. Treasuries held under repurchase agreements" are excluded from the portfolio-level 24-month dollar-weighted average maturity calculation. The regulator's carve-out set in 17 CFR 1.25(b)(3)(iv) is government money market funds, Treasury ETFs, and foreign sovereign debt; U.S. Treasury repos are not part of the carve-out. The model inverted the exclusion set.
Asked separately for the SIDR and customer risk disclosure compliance date, Opus 4.7 anchored the deadline at "a separate, later date (commonly described as roughly six months to a year after the effective date)" where the regulator's published anchor is March 31, 2025.
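The size trigger quoted above from 17 CFR 1.25(b)(3)(ii) is a two-condition test, and a short sketch makes clear which axes a concentration check has to read. The following is an illustration only, built from the thresholds as this article states them. The `FundHolding` record, the function names, and the sample figures are hypothetical; summing all triggered funds against a single 50 per cent ceiling is this sketch's reading of the quoted sentence; and funds below the triggers are simply left out because the text quoted here does not say how they are treated. Check the rule text before relying on any of it.

```python
from dataclasses import dataclass

# Thresholds as quoted in this article from 17 CFR 1.25(b)(3)(ii).
# Assumption: verify each figure against the regulator's text before use.
FUND_ASSET_TRIGGER = 1_000_000_000        # fund assets of at least $1 billion
MANAGER_ASSET_TRIGGER = 25_000_000_000    # management company manages at least $25 billion
CEILING = 0.50                            # 50 per cent of total segregated assets


@dataclass
class FundHolding:
    """Hypothetical record for one government MMF or qualified ETF position."""
    name: str
    invested: float         # segregated funds invested in this fund
    fund_assets: float      # the fund's own assets
    manager_assets: float   # assets managed by the fund's management company


def meets_size_trigger(holding: FundHolding) -> bool:
    # Both conditions are keyed to the fund and its manager, not to FCM size.
    return (holding.fund_assets >= FUND_ASSET_TRIGGER
            and holding.manager_assets >= MANAGER_ASSET_TRIGGER)


def check_50_percent_ceiling(holdings, total_segregated: float):
    """Return (share, within_ceiling) for holdings that meet both triggers.

    Assumption: the ceiling is tested on the aggregate of triggered funds.
    Funds below the triggers are outside the scope of this sketch.
    """
    if total_segregated <= 0:
        raise ValueError("total_segregated must be positive")
    triggered = sum(h.invested for h in holdings if meets_size_trigger(h))
    share = triggered / total_segregated
    return share, share <= CEILING


# Made-up example figures, in dollars.
holdings = [
    FundHolding("Fund A", 30.0, 2_000_000_000, 30_000_000_000),
    FundHolding("Fund B", 25.0, 2_000_000_000, 30_000_000_000),
    FundHolding("Fund C", 40.0, 500_000_000, 30_000_000_000),  # fund too small
]
share, within_ceiling = check_50_percent_ceiling(holdings, total_segregated=100.0)
```

In this made-up portfolio, Funds A and B meet both triggers and together account for 55 per cent of segregated assets, so the check fails. A tool that looked only at FCM size, as the model's answer invites, would never evaluate either condition.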
Claude Sonnet 4.6 reproduced the size-trigger elision and added a fabricated tier structure. On the concentration-limit question, Sonnet 4.6 wrote that "there is no size-based tier that changes the percentages based on the FCM's total assets" and then described a Tier 1 "no more than 10 per cent of total assets held in customer segregation may be invested in any single government money market fund" rule. The regulator's text does not key the 50 per cent ceiling to FCM size; it keys it to fund asset size and management company asset size, the exact triggers the model dropped while presenting an FCM-size-negation answer that misdirects the reader away from the actual governing thresholds.
On the DWAM question, Sonnet 4.6 wrote that the 2024 amendments "do not impose a new dollar-weighted average maturity (DWAM) standard or a maximum remaining-maturity cap specifically on direct U.S. Treasury obligations" and concluded that "no DWAM standard or individual-maturity cap found in the 2024 amendments applies to that category." The regulator's text imposes a 24-month portfolio-level DWAM that applies to direct U.S. Treasury obligations by default; the carve-out covers government money market funds, Treasury ETFs, and foreign sovereign debt, not direct Treasuries. The model returned a no-standard answer to a question the regulator answers with a standard.
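Because the DWAM findings turn on which asset classes enter the calculation, a small worked example shows how much the exclusion set matters. The sketch below computes a dollar-weighted average maturity as the sum of value times months to maturity, divided by total value, over the positions that are not carved out. The carve-out set and the 24-month limit are taken from this article's account of 17 CFR 1.25(b)(3)(iv). The asset-class labels, the function name, and every figure are hypothetical, and how maturity is measured for each instrument is not addressed here.

```python
# Limit and carve-out set as described in this article; verify against the rule text.
DWAM_LIMIT_MONTHS = 24
CARVED_OUT = frozenset({"government_mmf", "treasury_etf", "foreign_sovereign_debt"})


def portfolio_dwam(positions, excluded=CARVED_OUT):
    """Dollar-weighted average maturity, in months, of the included positions.

    positions: iterable of (asset_class, dollar_value, months_to_maturity).
    excluded:  asset classes left out of the calculation.
    """
    included = [(value, months) for cls, value, months in positions
                if cls not in excluded]
    total_value = sum(value for value, _ in included)
    if total_value == 0:
        return 0.0  # nothing subject to the test
    return sum(value * months for value, months in included) / total_value


# Made-up positions: (asset class, dollar value, months to maturity).
positions = [
    ("us_treasury", 60.0, 36.0),             # direct Treasuries: included by default
    ("us_treasury_repo", 40.0, 1.0),         # not in the carve-out set
    ("government_mmf", 100.0, 0.0),          # carved out
    ("foreign_sovereign_debt", 50.0, 60.0),  # carved out
]

dwam_per_article = portfolio_dwam(positions)
# Same portfolio with repos wrongly excluded, as in the Opus 4.7 answer.
dwam_repo_excluded = portfolio_dwam(positions, excluded=CARVED_OUT | {"us_treasury_repo"})
```

With the carve-out set as the article describes it, the included positions average (60 × 36 + 40 × 1) / 100 = 22 months, inside the 24-month limit. Excluding the repo position moves the same portfolio to 36 months. Treating direct Treasuries as outside the standard altogether, as the Sonnet 4.6 answer does, would leave their 36-month maturity untested.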
A futures commission merchant chief risk officer, derivatives clearing organization treasury team, customer-funds compliance officer, or regtech tool drafting an investment policy statement, scoping segregated-fund concentration testing, or scheduling SIDR and customer risk disclosure updates against either output would set the wrong concentration ceiling triggers, exclude the wrong asset classes from DWAM testing, and miss the regulator's March 31, 2025 compliance anchor. That is the failure mode these findings document.
Both Claude Opus 4.7 and Claude Sonnet 4.6 (each with web search active) failed on the CFTC's 2024 amendments to Regulation 1.25 across the rule's three operative pillars: the size-triggered 50 per cent concentration ceiling under 17 CFR 1.25(b)(3)(ii), the 24-month portfolio DWAM standard under 17 CFR 1.25(b)(3)(iv) with its narrow carve-out set, and the separate March 31, 2025 SIDR compliance anchor. Across five confirmed failures (three from Claude Opus 4.7, two from Claude Sonnet 4.6), the dominant pattern is threshold-trigger elision combined with carve-out inversion: models surface one axis of a multi-condition rule correctly while dropping the axes that actually govern, substitute adjacent asset classes for narrow exclusion sets, and drift from date-certain anchors into generic relative ranges. Every failure is classified as inference_drift. The signal for an AI lab is that for amended regulatory frameworks of this profile (final rule introduces new conditional structure layered on a legacy codification), retrieval is not surfacing the primary rule text at sufficient authority weight to dislodge a trained-schema prior, and generation defaults to a confident schema substitution rather than refusal or hedging.
Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.