Claude Code probes the mystery of confabulation in BBNJ biodiversity treaty AI cognition gaps.
— RLB Specialist Panel
SINGAPORE, June 10, 2026. Two frontier artificial-intelligence models generated structurally confident but textually wrong reconstructions of the Agreement under the United Nations Convention on the Law of the Sea on the Conservation and Sustainable Use of Marine Biological Diversity of Areas Beyond National Jurisdiction, known as the BBNJ Agreement, according to a white paper released today by RegLeg Brief, a regulatory-research outfit operated by Singapore-incorporated Verdus Technologies Pte. Ltd.
The six findings, published with immutable RLB Citation IDs including RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q003-Opus47, RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q003-Sonnet46, RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q001-Opus47, RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q001-Sonnet46, RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q004-Sonnet46, and RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q005-Opus47, span the Agreement's environmental impact assessment screening provision (Part IV, Article 27), its marine genetic resources and digital sequence information framework (Part II, Articles 10 and 14), and the non-undermining clause that bounds the Conference of the Parties (Part III, Article 22(2)).
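For readers who want to group the findings by question or by model, the Citation IDs above can be split mechanically. The sketch below assumes, from the six IDs listed here only, that every ID ends in a three-digit question number and a model tag; RegLeg Brief does not document the ID grammar in this release, so the pattern and the `slug`, `question`, and `model` field names are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed layout, inferred from the six IDs in this release:
#   <instrument slug>-Q<three digits>-<model tag>
# Only the two trailing fields are interpreted; the slug stays opaque.
CITATION_ID = re.compile(
    r"^(?P<slug>RLB-[A-Z0-9-]+)-(?P<question>Q\d{3})-(?P<model>[A-Za-z]+\d+)$"
)


class CitationId(NamedTuple):
    slug: str
    question: str
    model: str


def parse_citation_id(raw: str) -> CitationId:
    """Split an RLB Citation ID into slug, question number, and model tag."""
    match = CITATION_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not a recognised RLB Citation ID: {raw!r}")
    return CitationId(match["slug"], match["question"], match["model"])


example = parse_citation_id(
    "RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q003-Opus47"
)
```

Here `example.question` is `"Q003"` and `example.model` is `"Opus47"`; a string that does not fit the assumed layout raises `ValueError` rather than being silently mis-split.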
Both Anthropic's Claude Opus 4.7 and Claude Sonnet 4.6 were tested with web search active, mirroring how marine policy advisers, deep-sea biotechnology lawyers, intergovernmental secretariat staff, and pharmaceutical compliance teams actually use the models on a treaty that entered into force on 17 January 2026.
The BBNJ Agreement is a structured instrument, divided into Parts and articles, with each operative obligation pinned to a specific provision number that downstream practitioners are expected to cite. Four of those provisions sit at the heart of the findings:
Article 10(1) (Part II): the marine genetic resources regime applies to resources "collected and generated after the entry into force of this Agreement for each Party".
Article 14(1) (Part II): the digital sequence information benefit-sharing duty, under which benefits "and their digital sequence information shall be shared in a fair and equitable manner".
Article 22(2) (Part III): the non-undermining clause, under which the Conference of the Parties "shall respect the competences of, and not undermine, relevant legal instruments and frameworks and relevant global, regional, subregional and sectoral bodies".
Article 27 (Part IV): the environmental impact assessment screening threshold, framed around "more than a minor or transitory effect".
Each of these provisions states a single position pinned to a specific article number. The substantive content and the article number are jointly load-bearing: a citation that gets the content right but the article number wrong is still a defective regulatory citation, because downstream supervisors, panels, and counterparties will check the article number first.
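The "jointly load-bearing" point can be made concrete with a small check that tests the quoted text and the article number together. This is a minimal sketch, not the Specialist Panel's tooling: the `PRIMARY_TEXT` table holds only the verbatim fragments quoted in this release (a real check would load the full deposited treaty text), and the names `Citation`, `articles_containing`, and `check_citation` are hypothetical.

```python
from dataclasses import dataclass

# Verbatim fragments as quoted in this release, keyed by article number.
# Assumption: a production check would use the full deposited treaty text.
PRIMARY_TEXT = {
    "10(1)": "collected and generated after the entry into force of this "
             "Agreement for each Party",
    "14(1)": "and their digital sequence information shall be shared in a "
             "fair and equitable manner",
    "22(2)": "shall respect the competences of, and not undermine, relevant "
             "legal instruments and frameworks and relevant global, regional, "
             "subregional and sectoral bodies",
    "27": "more than a minor or transitory effect",
}


@dataclass(frozen=True)
class Citation:
    article: str      # article number the model cited
    quoted_text: str  # language the model attributed to that article


def _normalise(text: str) -> str:
    """Lower-case and collapse whitespace so line breaks do not matter."""
    return " ".join(text.lower().split())


def articles_containing(quoted_text: str) -> list[str]:
    """Return every article whose stored text contains the quoted language."""
    needle = _normalise(quoted_text)
    return [art for art, body in PRIMARY_TEXT.items() if needle in _normalise(body)]


def check_citation(citation: Citation) -> str:
    """Classify a citation as ok, misattributed, or text_not_found."""
    found = articles_containing(citation.quoted_text)
    if not found:
        return "text_not_found"
    return "ok" if citation.article in found else "misattributed"
```

Under these assumptions, `check_citation(Citation("30", "more than a minor or transitory effect"))` returns `"misattributed"`, while the same quotation pinned to `"27"` returns `"ok"`: correct content with the wrong article number still fails.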
Asked whether the BBNJ Agreement's marine genetic resources framework under Article 10(1) applies prospectively or retroactively, Claude Opus 4.7 (with web search on) wrote, verbatim:
"Under Article 10(1), the MGR/DSI provisions apply to 'utilization' of MGR and DSI of ABNJ that were collected or generated before entry into force, not just after."
The structural error. Article 10(1)'s actual text limits the regime to resources "collected and generated after the entry into force of this Agreement for each Party", and the non-retroactivity position was separately confirmed by most parties in formal declarations on deposit. The model inverted the temporal default. A deep-sea biotechnology firm or research institution treating the output as authoritative would place legacy sample collections (decades of pre-2026 marine specimens held in university and biotech freezers) inside Part II's notification, access, and benefit-sharing obligations, which the Agreement does not extend to them.
Asked in a separate question which article sets the EIA screening threshold for planned high-seas activities, Opus 4.7 identified Article 30 and stated that screening sits at Article 31. The verbatim screening-threshold provision is Article 27 (Part IV), and the "more than a minor or transitory effect" qualitative test the model paraphrased is the Article 27 language, not Article 30. The substrate document the panel cites identifies the article number directly.
On the non-undermining duty, Opus 4.7 wrote that "Article 5 / Article 8" of the Agreement "explicitly require it not to undermine relevant legal instruments, frameworks and competent global, regional, subregional and sectoral bodies." The verbatim non-undermining language the model paraphrased ("shall respect the competences of, and not undermine, relevant legal instruments and frameworks and relevant global, regional, subregional and sectoral bodies") is the text of Article 22(2), which bounds the Conference of the Parties specifically. The model's "Article 5 / Article 8" citation does not anchor to that text in the Agreement.
The failure modes are classified as misstated_rule (Article 10(1) inversion) and misattributed (Article 27 and Article 22(2) misnumbering) against substrate documents p_01_ACT_Part_III___Article_22_2____non_undermini_text-bbnj-agreement.html and p_01_ACT_Part_IV_EIA___Article_27___screening_thr_text-bbnj-agreement.html.
On the same retroactivity question, Sonnet 4.6 (with web search on) wrote:
"the benefit-sharing and notification obligations apply to the utilisation of MGRs, and associated DSI, collected or generated before the Agreement entered into force (17 January 2026). In practice this means... samples collected decades ago but first commercialised after the Agreement's entry into force would be subject to Part II requirements."
The fabrication. The Agreement contains no such "commercialised-after" trigger. Article 10(1) is binary: the regime applies to resources collected and generated after entry into force, and not to those collected before. The model constructed a "first-commercialisation" carve-in that the treaty does not contain, and grounded it in a confident reading of an entry-into-force date the treaty does establish.
A pharmaceutical compliance officer or technology-transfer lawyer building a Part II notification workflow on this output would route pre-2026 sample portfolios through a regulatory regime the BBNJ Agreement does not extend to them, and would issue contractual representations to counterparties on that basis.
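The binary reading of Article 10(1) described above can be written as a one-line test, which makes clear that a commercialisation date has no place in it. This is an illustrative sketch of the rule as this release states it, not a compliance tool: the function name and parameters are hypothetical, a single date stands in for "collected and generated", the strict "after" comparison is an assumption, and the entry-into-force date used is the 17 January 2026 date given in this release (the treaty text says "for each Party", so the date may differ by Party).

```python
from datetime import date
from typing import Optional

# Entry-into-force date as stated in this release; per-Party dates may differ.
ENTRY_INTO_FORCE = date(2026, 1, 17)


def part_ii_applies(
    collected_on: date,
    entry_into_force_for_party: date = ENTRY_INTO_FORCE,
    first_commercialised_on: Optional[date] = None,
) -> bool:
    """Return True if the resource falls inside the Part II regime.

    first_commercialised_on is accepted only to show that it is ignored:
    on the reading described here, the test turns on collection date alone.
    """
    # Assumption: "after the entry into force" is modelled as strictly later.
    return collected_on > entry_into_force_for_party


# A sample collected decades ago and commercialised in 2027 stays out of scope.
legacy_sample_in_scope = part_ii_applies(
    date(1998, 5, 1), first_commercialised_on=date(2027, 3, 1)
)
# A sample collected after entry into force is in scope.
new_sample_in_scope = part_ii_applies(date(2026, 6, 1))
```

`legacy_sample_in_scope` is `False` and `new_sample_in_scope` is `True`. The "first-commercialisation" trigger in the model output would require the third argument to change the result, which it cannot.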
Asked which article establishes the EIA screening threshold, Sonnet 4.6 identified "Part IV, Article 30" and quoted the "more than a minor or transitory effect" language. As with Opus 4.7, the screening-threshold provision is Article 27. The qualitative test the model quoted verbatim is correct; the article number it pinned that test to is not.
Asked whether the Agreement's benefit-sharing framework extends to digital sequence information derived from marine organisms in international waters, and which provision governs, Sonnet 4.6 attributed the obligation to "Part II (Article 15(5) and related provisions)". The DSI benefit-sharing duty sits in Article 14(1), which is explicit that benefits "and their digital sequence information shall be shared in a fair and equitable manner." Article 15 governs a different aspect of the Part II regime.
The failure modes are classified as inference_drift (Article 10(1) commercialisation carve-in) and misattributed (Article 27 and Article 14(1) misnumbering) against substrate documents p_01_ACT_Part_III___Article_22_2____non_undermini_text-bbnj-agreement.html, p_01_ACT_Part_IV_EIA___Article_27___screening_thr_text-bbnj-agreement.html, and p_01_ACT_Part_II___DSI_included_in_BBNJ_vs__exclu_text-bbnj-agreement.html.
The BBNJ findings sit inside a failure class the RegLeg Brief Specialist Panel labels Treaty Article Drift: frontier models locking onto a substantively coherent paraphrase of a treaty provision while reaching for an article number from the model's prior, rather than from the treaty's actual structure, and simultaneously inverting temporal-scope defaults on provisions where the treaty's express position is non-retroactive.
Across the six findings, the drift takes three shapes:
The common substrate is a generation pathway in which the model's prior about how a UN biodiversity treaty's articles are typically arranged overrides the BBNJ Agreement's actual numbering, while the verbatim regulator text is either not retrieved or retrieved but not allowed to override the article-number prior.
All six outputs shared the same surface characteristics: confident article-level citations, substantively plausible paraphrases of the treaty's provisions, defined-term usage ("MGR", "DSI", "ABNJ", "Part II", "Part IV") that tracks the Agreement's vocabulary, and no hedging or caveats. The failure is not recoverable by the user in real time because the output reads like a competent treaty-law brief, the kind of paragraph an international legal adviser would expect to receive from a senior associate or secretariat staffer.
Validation against the Agreement's primary text would only happen if the reader already knew which article number contained which subject matter, which is the question they asked the model in the first place.
The population most exposed includes marine policy advisers in foreign ministries and intergovernmental secretariats; deep-sea biotechnology and bioprospecting compliance teams at pharmaceutical and industrial-biotech firms; technology-transfer lawyers drafting MGR access and benefit-sharing contracts; environmental impact assessment consultants scoping high-seas activities for shipping, cable-laying, marine carbon dioxide removal, and seabed research operators; and academic researchers preparing regulatory submissions under domestic implementing legislation.
All of these workflows route through AI-assisted research on tight timelines, and almost all of them generate written deliverables (contracts, EIA scoping reports, supervisor-facing self-assessments) that downstream readers treat as authoritative without re-checking the cited article number against the deposited treaty text.
The RegLeg Brief Specialist Panel documents a series of red-team probe designs that any AI lab or alignment team can run against their own models with no commercial engagement required:
RegLeg Brief operates as a completely ungated, open-access public resource. The white papers, per-finding cards, regulator verbatim excerpts, RLB Citation IDs, methodology notes and supporting data logs are all published without paywalls, registration walls, or data-licensing fees. By documenting original regulatory research without financial or distribution barriers, the platform ensures that:
Because RegLeg Brief conducts its own original research and adversarial analysis against frontier AI models, the detail in each published finding is precise enough to enable AI labs to take targeted hallucination-mitigation measures. Directions an AI lab might consider, drawing on the published findings, include:
AI labs and model developers named in any published finding have an unconditional right of reply; the Specialist Panel will publish any factual correction or contextual response alongside the original finding, with no editorial gatekeeping. Researchers, regulators, and compliance teams with questions on methodology or specific findings can reach the Specialist Panel via the same channel.
These findings and the associated work are published in the public interest, to support the development of a safer AI ecosystem. Any party reading this or any finding on reglegbrief.com may contact us and has the same unconditional right of reply.
RegLeg Brief is operated by Verdus Technologies Pte. Ltd. (UEN 201616982R), incorporated in Singapore. The RLB Specialist Panel, with an aggregate of over 60 years of public-policy and industry experience, documents only confirmed hallucination findings, under a methodology that requires a verbatim regulator excerpt for every documented claim. All findings, citation IDs, model outputs, regulator excerpts, and methodology notes are open-access.
Primary source verified: UN BBNJ Agreement (2023), Agreement on the Conservation and Sustainable Use of Marine Biological Diversity of Areas Beyond National Jurisdiction · Substrate documents: p_01_ACT_Part_III___Article_22_2____non_undermini_text-bbnj-agreement.html, p_01_ACT_Part_II___DSI_included_in_BBNJ_vs__exclu_text-bbnj-agreement.html, p_01_ACT_Part_IV_EIA___Article_27___screening_thr_text-bbnj-agreement.html · UN portal: documents.un.org
Citation IDs referenced:
RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q001-Opus47
RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q001-Sonnet46
RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q003-Opus47
RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q003-Sonnet46
RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q004-Sonnet46
RLB-H-INT-UNTC-BBNJ-HIGH-SEAS-BIODIVERSITY-AGREEMENT-2023-Q005-Opus47