AI Labs · Last updated 7 Jun 2026 · methodology v2.3 · Hallucination Register

CFTC Digital Asset Collateral Staff Guidance 2025: Hallucination Patterns across frontier AI models

This whitepaper presents the hallucination findings from RegLeg's audit of the CFTC December 2025 digital asset collateral package. Three questions were drawn from the operational text of Staff Letter 25-40 and its February 2026 reissuance as Staff Letter 26-05. Each question targeted a specific operational rule rather than a general framework summary, the configuration most likely to surface a hallucination if one exists.

When this affects AI Labs

The CFTC's digital asset collateral framework is short, recent, and not yet consolidated into a formal CFTC rule. The staff letters are the operative text, and the framework's content is exactly the kind of jurisdiction-specific, high-stakes, fast-moving material that web-search-augmented AI tools are expected to handle. When an AI lab's models are used by FCMs, stablecoin issuers, prime brokers, and their advisers, the downstream artefact is a compliance memo, an eligibility opinion, or a calendar entry in the firm's reporting cycle. Unless an external check intervenes, each of the failures documented here would reach one of those deliverables.

Aggregate impact

Across three questions and five model configurations, the audit produced five hallucinations and zero correct answers. The failures cluster around the dropped-qualifier pattern (headline rule captured, cross-reference lost) and the obligation-inversion pattern (continuing obligations described as time-boxed). Both patterns share a generation profile that suggests retrieval surfaces secondary summaries faster than it surfaces the controlling enumeration in the staff letters themselves.

What your team should do

On the training-data side, the cluster argues for explicit ingestion of CFTC Market Participants Division staff letters into the retrieval corpus, with structural emphasis on enumerated conditions: which conditions cease at a phase boundary, which continue, and which cross-reference an external interpretive authority. On the post-training side, the inversion finding is the more instructive failure: the model reached its answer by inferring a phase-bounded structure from the framework's three-month onboarding period rather than reading the literal enumeration. On the probe-design side, the multi-DCO haircut question is a useful template for surfacing dropped-qualifier failures in the CFTC, prudential, and securities-law corpora.
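As a minimal sketch of what "structural emphasis on enumerated conditions" could look like at ingestion time, the Python below tags each condition with whether it ceases at a phase boundary or continues, and with any external cross-reference, so that these attributes travel with the condition text into the retrieval corpus. The class names, field names, and example labels are hypothetical placeholders for illustration; they are not taken from the staff letters or from RegLeg's methodology.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Duration(Enum):
    CEASES_AT_PHASE_BOUNDARY = "ceases_at_phase_boundary"
    CONTINUING = "continuing"


@dataclass(frozen=True)
class Condition:
    label: str                              # placeholder identifier, not a real paragraph number
    duration: Duration                      # does the condition end at a phase boundary or continue?
    cross_reference: Optional[str] = None   # external interpretive authority, if any


def to_retrieval_text(source: str, conditions: list[Condition]) -> str:
    """Render an enumeration so duration and cross-reference sit beside each condition."""
    lines = [f"Source: {source}"]
    for c in conditions:
        line = f"- {c.label} [{c.duration.value}]"
        if c.cross_reference:
            line += f" (see: {c.cross_reference})"
        lines.append(line)
    return "\n".join(lines)


# Hypothetical enumeration; the labels stand in for real condition text.
example = [
    Condition("condition-A", Duration.CEASES_AT_PHASE_BOUNDARY),
    Condition("condition-B", Duration.CONTINUING, cross_reference="external-authority-X"),
]
print(to_retrieval_text("hypothetical staff letter", example))
```

Keeping the duration and cross-reference on the same line as the condition is one way to reduce the chance that a retrieved chunk carries the headline rule without its qualifier; other chunking strategies may serve equally well.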

How RegLeg can help

RegLeg runs targeted audits of frontier-model behaviour on regulatory text. For a frontier lab, the practical handle is the probe set: the questions that surfaced these failures were drawn from operational questions an FCM's compliance counsel would actually ask of a deployed model. We can deliver the probe set, the source substrate, and the expected-answer enumeration as a regression-testing input for a labeled model checkpoint, and re-run the audit against successor checkpoints to measure the delta between them.
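A rough sketch of how a probe set and expected-answer enumeration might be wired into a regression test is shown below. It assumes a probe is a question plus phrases the answer must contain (the qualifiers) and phrases that would signal an inverted obligation. The `Probe`, `score`, and `delta` names are hypothetical, and plain substring matching is a deliberately crude stand-in for whatever grading RegLeg's audit actually applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    question: str
    required_phrases: tuple[str, ...]         # expected-answer enumeration (qualifiers that must appear)
    forbidden_phrases: tuple[str, ...] = ()   # phrasings that would signal an inverted obligation


def score(probe: Probe, answer: str) -> dict:
    """Flag dropped qualifiers (missing) and inversion signals (inverted) in one answer."""
    text = answer.lower()
    missing = [p for p in probe.required_phrases if p.lower() not in text]
    inverted = [p for p in probe.forbidden_phrases if p.lower() in text]
    return {"missing": missing, "inverted": inverted, "passed": not missing and not inverted}


def delta(before: list[dict], after: list[dict]) -> int:
    """Change in the number of passing probes between two checkpoints."""
    return sum(r["passed"] for r in after) - sum(r["passed"] for r in before)
```

In use, each checkpoint's answers would be scored against the same probes and the two result lists passed to `delta`; a positive value indicates more probes passing on the successor checkpoint.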


Every finding on this page compares an AI subject's account of the rule against the regulator's verbatim text from the regulator's own portal. Both are linked. Each delta, its root causes, and impact analysis are documented and published with immutable Citation IDs.