Executive Summary
Across five questions put to AI tools about the 2025 revision of the OECD Merger Review Recommendation (OECD/LEGAL/0333), the tools produced a hallucination on every one: a five-out-of-five failure rate that should give any international M&A practitioner pause before reaching for them in a merger-control context. The dominant failure pattern is structural fabrication: AI tools consistently invented operative sections of the Recommendation, most often a standalone "International Co-operation" or "Transnational Co-operation" block, while omitting the actual Section V (ex-post assessment of merger decisions and/or remedies).
On two separate questions probing the Recommendation's architecture, AI assistants produced the same invented structure and the same omission, suggesting the error is systematic rather than incidental. In a separate misattribution failure, AI tools grafted EU merger-control practice (the fix-it-first / upfront-buyer / crown-jewel sub-hierarchy) onto Section IV.3 as though it were the OECD Recommendation's own text, and cited real OECD publications in support. For lawyers advising on multi-jurisdictional filings or jurisdictional compliance against this instrument, each of these errors carries direct professional-liability exposure.
How AI gets this regulation wrong
The failures recorded on this Recommendation cluster heavily around confident structural invention: AI tools generated plausible-sounding but incorrect accounts of the Recommendation's operative architecture, then retracted or hedged only when directly challenged. A secondary failure mode is importation: AI assistants absorbed substantive doctrine from cognate jurisdictions or OECD commentary documents and presented it as the operative text of the Recommendation itself, citing real OECD publications to lend the output credibility it did not warrant.
| AI's Failure Mode | Count | Affected findings |
|---|---|---|
| Exposed Fabrication | 4 | Finding#1 · Finding#3 · Finding#4 · Finding#5 |
| Misattributed | 1 | Finding#2 |
What that means for your practice
The risk profile for lawyers relying on AI tools on this Recommendation skews heavily toward professional indemnity exposure: most failures arise in analysis that would directly feed a client-facing opinion, a filing position, or a defence of a specific procedural or substantive argument. One finding produces a wrong deliverable at the technical drafting level: errors in the reporting-interval structure that would survive a plausibility check and be caught only on close primary-source review, by which point client instructions may already have been given.
| Risk Impact | Count | Affected findings |
|---|---|---|
| Liability / PI exposure | 4 | Finding#1 · Finding#2 · Finding#4 · Finding#5 |
| Wrong deliverable | 1 | Finding#3 |
When this affects lawyers
International M&A lawyers encounter the 2025 Recommendation at several distinct pressure points: scoping a multi-jurisdictional filing strategy when an OECD-adherent jurisdiction's framework is being assessed for adequacy; advising a client on the failing firm defence in a contested merger review; and structuring remedies negotiations where the hierarchy of acceptable structural versus behavioural remedies is directly in play. Any of these tasks is a plausible candidate for an AI-assisted first pass: a junior drafting a scope memo, a partner running a rapid comparator of OECD-adherent jurisdictions, or a team prepping a call with agency staff.
The risk is sharpest when the AI output goes into a client deliverable without direct primary-source verification. The Recommendation's operative architecture, five numbered RECOMMENDS sections, is not an obscure technical detail; it is the foundational structure that determines which obligations apply and in what sequence. An opinion that attributes non-existent sections to the instrument, or that describes an ex-post assessment obligation as something the Recommendation contains only implicitly, is wrong in a way a counterparty, a regulator, or a court will notice.
The failing firm defence failure carries the highest individual exposure. Section III.11.b uses "inter alia" explicitly, signalling that the enumerated three conditions are not exhaustive, a point AI assistants dropped while simultaneously mischaracterising the third condition's content. A lawyer who takes that output at face value and advises a client that the defence requires only those three conditions, satisfied cumulatively and to a closed standard, has given advice that is factually incorrect in a merger review where the authority may require additional evidence.
The remedies-hierarchy failure (importing EU fix-it-first doctrine as OECD text) is equally dangerous in any OECD-framework jurisdiction that does not actually follow EU practice, because the advice would point to the wrong normative baseline.
The findings at a glance
The table below summarises each finding, the question area, the nature of the AI failure, and the risk category, across all five hallucinations recorded on the 2025 OECD Merger Review Recommendation.
Aggregate impact
The pattern across these five findings is not random noise; it is a specific and repeatable failure to correctly render the Recommendation's top-level operative architecture. Three of the five findings involve AI tools generating accounts of the Recommendation's structure that include a "Co-operation" or "Monitoring" section as a standalone operative RECOMMENDS block, while consistently omitting Section V (ex-post assessment). This is not a failure to recall an obscure sub-clause; it is a failure on the headline structure of the instrument.
The fact that two separate questions about the Recommendation's scope, posed to multiple AI tools, produced structurally identical invented architectures suggests the tools have a stable but incorrect internal representation of this specific 2025 revision.
The remaining two findings expose a second, equally consequential failure mode: doctrinal importation. In the Section IV.3 remedies case, AI tools reproduced EU merger-control practice (a well-documented and widely discussed sub-hierarchy for structural remedies) as though it were the OECD Recommendation's own operative text, and cited real OECD publications to support it. This is the most dangerous failure type for a practising lawyer, because the output is internally coherent, supported by real citations, and substantively plausible to a reader with EU merger experience. It would pass a casual review.
The failing firm defence finding is similar: the mischaracterisation of condition three (from a comparative-harm test to an assets-exit counterfactual) reflects the traditional three-part test familiar from US and EU doctrine, not the OECD text, and the "inter alia" qualifier was silently dropped.
Taken together, the errors cluster on the three most substantively consequential areas of the Recommendation for a transactional lawyer: the instrument's overall structure (which determines which obligations apply), the remedies hierarchy (which governs negotiation positioning), and the failing firm defence (which determines whether a merger can proceed at all). The reporting-timeline error is lower-stakes for a transactional practice but is the kind of clean technical mistake that would undermine a practitioner's credibility on a governance or compliance matter where precision on dates is expected.
What your team should do
The default position for any lawyer relying on AI tools for work touching this Recommendation should be: AI output is a pointer toward the primary text, not a substitute for it. The Recommendation is publicly available at OECD/LEGAL/0333, and its operative RECOMMENDS sections (I–V) take approximately two minutes to read. Given that AI tools have demonstrated a consistent, repeatable failure to correctly enumerate those sections, with a fabricated Co-operation section appearing in multiple separate AI runs, any AI-generated account of the Recommendation's structure must be verified against the text before it is used in a deliverable.
That verification takes a couple of minutes; the professional exposure from skipping it lasts far longer.
For failing firm defence work, the verification standard is higher. The "inter alia" qualifier in Section III.11.b is substantively significant: it means practitioners cannot treat the three enumerated conditions as a closed cumulative test, and cannot advise clients that satisfying those three conditions exhausts the authority's entitlement to require further evidence. The AI tools we tested dropped that qualifier and simultaneously got the content of the third condition wrong. Both errors are likely to survive a quick read of an AI-generated response if the reader's mental model is EU or US doctrine rather than the OECD text.
The safeguard is to read Section III.11.b directly and to flag to clients that the standard is authority-specific and non-exhaustive.
AI tools are safer on background framing that does not depend on textual precision: comparative summaries of OECD-adherent jurisdictions' notification thresholds, general market context for a filing strategy, or first-pass drafts of procedural timelines where the operative dates are separately verified. They are not safe for structural characterisation of the Recommendation, remedies-hierarchy analysis, failing firm defence conditions, or any output where an incorrect citation to an OECD document would be taken as authoritative.
In those areas, treat AI output as a rough orientation rather than a starting draft, and build in explicit primary-source review before the output moves to a client-facing document.
How RLB Can Help
RegLeg's published Hallucination Research is available as a free pre-flight check for lawyers working on regulatory matters. Before relying on AI-assisted output, whether for advice, drafting, or due diligence, lawyers can consult the research to understand which failure modes have been observed for the specific regulation in question. This is not a substitute for legal judgement, but it is a structured, independent reference that flags where AI tools have historically misfired, allowing practitioners to focus their human verification effort on the highest-risk points.
For firms where multiple lawyers work across the same regulatory portfolio, RegLeg offers bespoke deep-dive engagements. These go beyond the published research to examine the specific regulations, jurisdictions, and question types most relevant to the firm's practice. The output is a tailored briefing that legal teams can use as a standing reference, updated as the regulatory landscape evolves, giving the whole team a shared, consistent picture of where AI tools should be treated with caution and where they have performed reliably.
RegLeg also works with legal teams on training and CPD-aligned content. This covers the categories of failure lawyers are most likely to encounter, including outdated regulatory text, cross-jurisdictional confusion, and misattributed citations, framed around real regulatory examples rather than abstract AI theory. Separately, RegLeg can conduct a confidential review of a firm's existing AI-use policy, assessing it against the failure-mode catalogue the research has surfaced. The output is a structured gap analysis: which risks the policy already addresses, which it does not, and where practical amendments would strengthen the firm's position.
