AI Hallucination on Recommendation of the Council on Merger Review (2025 Revision) for Legal teams at Investment Banking firms in international jurisdictions

Executive Summary

Legal teams at international investment banking firms rely on the 2025 OECD Merger Review Recommendation (OECD/LEGAL/0333) as the authoritative reference point when advising deal teams on cross-border merger control strategy, designing remedies packages, and assessing procedural exposure across adherent jurisdictions. Across the four questions we tested against the primary text, AI tools produced wrong answers on all four, a 100% failure rate on this regulation.

The failures concentrate on two dimensions: AI tools either invented structural content that does not exist in the Recommendation (fabricating a standalone "International Co-operation" operative section and a "Monitoring" section that have no equivalent in the five-section RECOMMENDS clause) or imported doctrine from the EU merger regime and presented it as the OECD standard (re-characterising the Section IV.3 structural remedies preference as a three-tier fix-it-first/upfront-buyer/crown-jewel hierarchy). When challenged on these answers, the AI tools in most cases admitted uncertainty, confirming that the initial confident outputs were not grounded in the primary text.

For a Legal function that is routinely called on to provide rapid guidance on cross-jurisdictional merger filings and remedy design, these are not edge-case errors: they sit at the operational core of the Recommendation.

How AI gets this regulation wrong

The failures on this regulation split between AI tools that invented operative content with no basis in the primary text and one that grafted EU merger-control doctrine onto the OECD standard as if it were the Recommendation's own language. Both patterns share the same surface presentation: confident, structured, citation-ready answers that read as authoritative until the primary text is consulted. The table below breaks these down by failure type.

AI's Failure Mode	Count	Affected findings
Exposed Fabrication	3	Finding#1 · Finding#3 · Finding#4
Misattributed	1	Finding#2

What that means for your team

Every failure in this set falls into the same risk category: the AI produces a wrong deliverable, an answer that looks complete and usable but materially misrepresents the Recommendation. For a Legal team advising on a live cross-border deal, that means the error travels directly into advice, deal memos, or remedies strategy before anyone catches it. The table below maps each finding to the specific practice exposure it creates.

Risk Impact	Count	Affected findings
Wrong deliverable	4	Finding#1 · Finding#2 · Finding#3 · Finding#4

When this affects your department

The Recommendation is live in your workflow at multiple deal stages. When a cross-border transaction triggers merger filings in multiple adherent jurisdictions, Legal is typically the function that maps procedural obligations across those jurisdictions, identifies which elements of the deal structure carry the highest remedies risk, and advises deal teams on sequencing. AI tools get queried at precisely these points: for a quick summary of what the Recommendation requires on substantive analysis, for a framing of the remedies hierarchy to inform internal deal discussions, or for a reference on the conditions under which a failing firm defence is available.

These are not peripheral uses; they are the bread-and-butter of cross-border M&A legal support.

The structure of the Recommendation itself (how many operative sections it has and what each covers) is also a reference question that Legal teams field when updating internal training materials, briefing business lines, or responding to queries from compliance. If an AI tool tells a junior lawyer that the Recommendation has six operative sections (including "International Co-operation" and "Monitoring") when it has five, and that misinformation enters an internal training deck or a regulatory mapping document, the firm's stated understanding of its obligations under the Recommendation is wrong on the record.

For remedies strategy, the stakes are higher still. If Legal relies on AI output that presents a fix-it-first/upfront-buyer/crown-jewel sub-tier hierarchy as the Recommendation's standard, when that hierarchy is EU doctrine, not OECD, the firm's remedies proposals in non-EU jurisdictions could be designed against the wrong reference point.

Equally, if the failing firm defence is characterised to deal teams as a closed three-condition test when the Recommendation's language explicitly uses "inter alia" to preserve regulator discretion beyond those three elements, the firm may advise a client that a defence is available when the relevant authority is not bound to the same exhaustive framework. In cross-border transactions where multiple authorities are running concurrent reviews, these errors compound.

The findings at a glance

The four findings below cover the questions Legal teams at international investment banking firms are most likely to direct at AI tools on this regulation: the Recommendation's structure, the remedies hierarchy, the failing firm defence, and the scope of its operative commitments.

#	Finding title	Type	Citation ID
1	Fabricated operative section structure	Hallucination	RLB-F-INT-OECD-OECD-MERGER-REVIEW-RECOMMENDATION-2025-Q001
2	EU doctrine imported as OECD remedies hierarchy	Hallucination	RLB-F-INT-OECD-OECD-MERGER-REVIEW-RECOMMENDATION-2025-Q002
3	Failing firm defence mischaracterised as closed test	Hallucination	RLB-F-INT-OECD-OECD-MERGER-REVIEW-RECOMMENDATION-2025-Q005
4	Non-existent co-operation and monitoring sections invented	Hallucination	RLB-F-INT-OECD-OECD-MERGER-REVIEW-RECOMMENDATION-2025-Q006

Aggregate impact

Three of the four failures share the same structural pattern: AI tools answered questions about the Recommendation's operative architecture with invented content, fabricating sections that do not exist (a standalone "International Co-operation" section, a "Monitoring" section, six operative blocks where there are five) while omitting the actual Section V on ex-post assessment. This is not a matter of nuance or interpretation; the RECOMMENDS clause of the Recommendation runs to five numbered sections, and the AI tools consistently got the count, the labels, and the coverage wrong.

That these same tools, when challenged, acknowledged uncertainty about the primary text confirms the outputs were constructed rather than retrieved.

The remaining finding (Finding#2) involves a different but equally significant failure mode: importing doctrine from another regime. The EU merger-control practice on structural remedies (fix-it-first, upfront buyer, crown-jewel) is well established and widely known to practitioners. An AI tool presented this three-tier sub-hierarchy as the operative content of Section IV.3 of the OECD Recommendation, citing real OECD publications in support. The cited documents do not say what the AI claimed; the Recommendation's own text at Section IV.3 establishes only a standalone-business preference within the structural remedies tier, without the timing-based sub-ordering the AI described.

For a Legal team advising on remedies in an adherent jurisdiction that follows OECD guidance rather than EU practice, the error is directly actionable, and the pretextual citations give it a false air of authority.

Taken together, these failures cluster at the foundational level of what the Recommendation actually says and does. A Legal function that uses AI output to set the frame for a deal memo, an internal training, a remedies proposal, or a failing-firm-defence analysis is working from a distorted map. The compounding risk in the international investment banking context is that these errors are plausible to anyone who is not working from the primary text: the invented section labels sound reasonable, the EU sub-hierarchy is real doctrine (just in the wrong jurisdiction), and the failing firm conditions are partially correct.

That plausibility is precisely what makes them dangerous in a time-pressured deal environment.

What your team should do

The default position for this regulation is: do not use AI tools as the primary reference for any question about the Recommendation's operative content. The OECD publishes the text of OECD/LEGAL/0333 directly on its legal instruments database, and the Recommendation is short enough that a Legal team can verify structural and operative questions against the primary text in under ten minutes. Given the failure rate observed across testing, the cost-benefit of an AI shortcut is negative for any work product that will carry the firm's name.

For remedies questions specifically, the jurisdictional source matters. If the relevant authority is an EU member state competition authority, EU merger-regulation practice governs and AI tools trained heavily on EU guidance may perform better, but that is a different question from what the OECD Recommendation itself requires. Internal templates, deal memos, and training materials that conflate the two standards create persistent confusion for teams working across both regimes.

The practical safeguard is a standing check: when AI output on this Recommendation references timing-based sub-tiers within structural remedies (fix-it-first, upfront buyer, crown jewels), treat that as a signal the tool has crossed into EU doctrine, and verify against the Recommendation's text before using.
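For teams that want to automate the standing check above, a minimal sketch follows. It scans a piece of AI output for the three EU sub-tier terms named in this report and returns any that appear. The function name, the pattern list, and the regular expressions are illustrative assumptions, not an RLB tool; a hit means "verify against the primary text", not "the answer is wrong".

```python
import re

# Hypothetical red-flag patterns, taken from the three EU sub-tier terms
# named in this report. The spellings covered are an assumption; extend
# them to match the phrasing your tools actually produce.
EU_SUBTIER_PATTERNS = {
    "fix-it-first": r"fix[\s-]*it[\s-]*first",
    "upfront buyer": r"up[\s-]?front[\s-]+buyer",
    "crown jewel": r"crown[\s-]+jewels?",
}


def flag_eu_doctrine(ai_output: str) -> list[str]:
    """Return the EU sub-tier terms found in ai_output, in dictionary order."""
    return [
        label
        for label, pattern in EU_SUBTIER_PATTERNS.items()
        if re.search(pattern, ai_output, flags=re.IGNORECASE)
    ]
```

A reviewer could call `flag_eu_doctrine(text)` on any AI summary of Section IV.3 and route the output for primary-text verification whenever the returned list is non-empty. An empty list does not confirm accuracy; it only means these particular terms were not detected.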

For failing firm defence analysis, the "inter alia" qualifier in Section III.11.b is operationally significant: it preserves regulator discretion to require additional evidence beyond the three commonly cited conditions. AI tools tested here presented those three conditions as an exhaustive, closed test. When advising a client on whether a failing firm argument is viable in a jurisdiction applying the OECD standard, that nuance matters: an authority that treats the three-condition formulation as non-exhaustive has more room to reject a defence than a client briefed on a closed test would expect.

That is a conversation to have with the primary text open, not with an AI summary.

How RLB Can Help

RegLeg's published hallucination research is available as a free pre-flight check your team can run before relying on AI output for any regulatory question covered in the corpus. If a finding shows that AI tools systematically misstate the scope of a reporting obligation, conflate two regulatory regimes, or invent an exemption threshold, your team has that on record before the output reaches a deal memo or a client advice note. That is a cheaper intervention than discovering the error in review, or after.

For Legal functions in international investment banking specifically, we can map which AI-supported workflows carry the highest hallucination exposure for your book of business: cross-border transaction structuring, multi-jurisdictional disclosure analysis, derivatives documentation review, sanctions and restricted-party screening workflows, and regulatory change tracking across overlapping regimes. The output is a prioritised exposure map scoped to the jurisdictions and product lines your team actually touches, not a generic AI-risk inventory.

Where a firm already has an AI-use policy in place, we can review it against RegLeg's failure-mode catalogue and return a prioritised remediation list: which policy assumptions are contradicted by documented failure patterns, where the policy is silent on known high-risk task categories, and what workflow controls would close the material gaps. We can also produce training material and CPD-aligned content your team can deploy internally, grounded in real failure cases, framed for Legal professionals who do not need a primer on what AI is, only on where it fails in their domain.