
Methodology

Last updated: 2026-05-26 · Version 2.1

The RLB Specialist Panel catalogues the hallucinations and blind spots persisting in AI models when they answer regulatory questions. Every finding is verified against the authenticated primary regulator source. Every finding carries a unique Citation ID and stays permanently on the record. This page documents how we test, how we categorise what we find, and how we verify before publication.

Who we partner with

This research exists to serve four audiences:

Partnership inquiries: /partnership/.

What we test

AI models in scope

We test these two AI models in their consumer interface, with web search enabled — the natural posture of a working practitioner who turns to a general-purpose AI tool with a regulatory question.

These two models are in public scope because we pay for access to them ourselves, and our findings on them indicate risks that prevail across most current AI models. Under private collaborations with other AI model developers, we apply the same methodology to their models and work with them to strengthen those models against the seven Hallucination Modes and two types of Blind Spots.

How we categorise what we find

Hallucination Modes — when AI gives a wrong answer

Every confirmed hallucination is classified into one of seven public Modes. Four describe how the answer itself goes wrong:

Mode | Meaning
Misstated Rule | AI misstated what the current rule actually says, directly contradicting the regulator's authenticated text.
Outdated | AI gave an answer that was correct for an earlier version of the rule; the rule has since been amended, superseded, or revised.
Misattributed | AI confused this regulation or regulator with a different one, applying the wrong source's rules to the question.
Inference Drift | AI reasoned beyond the rule's actual text, then asserted that inference as if it were the regulator's own position, when the regulator never said any such thing.

Three more describe how the AI's cited sources go wrong:

Mode | Meaning
Fabricated | The cited URL does not exist, or exists but does not contain relevant information.
Pretextual | Cited as a source, but not the actual basis for the AI's answer.
Contradictory | AI cited this source to support its answer, but the source contradicts the answer instead of supporting it.
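
For readers who want to work with the seven Modes programmatically, the two groups above can be sketched as a small enumeration. This is only an illustrative data structure: the class and function names (`ModeGroup`, `HallucinationMode`, `modes_in`) are hypothetical and are not part of any published RLB tooling.

```python
from enum import Enum


class ModeGroup(Enum):
    """Hypothetical grouping: which part of the response went wrong."""
    ANSWER = "answer"   # the answer itself goes wrong
    SOURCE = "source"   # the AI's cited sources go wrong


class HallucinationMode(Enum):
    """The seven public Modes; labels are taken from the tables above."""
    MISSTATED_RULE = ("Misstated Rule", ModeGroup.ANSWER)
    OUTDATED = ("Outdated", ModeGroup.ANSWER)
    MISATTRIBUTED = ("Misattributed", ModeGroup.ANSWER)
    INFERENCE_DRIFT = ("Inference Drift", ModeGroup.ANSWER)
    FABRICATED = ("Fabricated", ModeGroup.SOURCE)
    PRETEXTUAL = ("Pretextual", ModeGroup.SOURCE)
    CONTRADICTORY = ("Contradictory", ModeGroup.SOURCE)

    def __init__(self, label, group):
        # Each member's value is a (label, group) tuple, unpacked here.
        self.label = label
        self.group = group


def modes_in(group):
    """Return the Modes belonging to one group, in declaration order."""
    return [mode for mode in HallucinationMode if mode.group is group]
```

Calling `modes_in(ModeGroup.ANSWER)` yields the four answer-level Modes and `modes_in(ModeGroup.SOURCE)` the three source-level Modes.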

AI Blind Spots — when AI gives no answer

Blind Spots are a separate category, parallel to hallucinations. A Blind Spot occurs when the AI refuses or fails to answer a regulatory question even though the answer is publicly available and can be found with a simple Google search.

How we verify

Every finding goes through a two-stage Panel review before publication.

Stage 1 inspects the AI's response against the authenticated regulator source and flags candidate hallucinations and blind spots. Stage 2 re-verifies each flagged item against the source independently — confirming or rejecting it. Only confirmed findings are published. Rejected items (false positives at Stage 1) are logged internally for ongoing quality improvement of the Panel process.
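
The control flow of the two stages can be sketched as follows. This is a minimal illustration, not the Panel's actual process: the `Candidate` fields and the two reviewer callables are hypothetical stand-ins for the Panel's judgement at each stage.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record: one AI claim and the text it is checked against."""
    claim: str         # what the AI asserted
    source_text: str   # authenticated regulator text


def two_stage_review(
    items: Iterable[Candidate],
    stage1_flags: Callable[[Candidate], bool],
    stage2_confirms: Callable[[Candidate], bool],
) -> Tuple[List[Candidate], List[Candidate]]:
    """Return (confirmed, rejected).

    confirmed: flagged at Stage 1 and confirmed at Stage 2 (published).
    rejected:  flagged at Stage 1 but not confirmed (logged internally).
    Items never flagged at Stage 1 appear in neither list.
    """
    confirmed, rejected = [], []
    for item in items:
        if not stage1_flags(item):
            continue                  # not a candidate finding
        if stage2_confirms(item):     # independent re-verification
            confirmed.append(item)
        else:
            rejected.append(item)     # false positive at Stage 1
    return confirmed, rejected
```

Keeping the Stage 2 check as a separate callable reflects the requirement that re-verification is independent of the Stage 1 inspection.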

Pre-publication review. As standard, we do not pre-clear findings with model owners when we pay for the model access. Pre-publication review of findings affecting your model is available under paid AI Labs partner engagements only.

Right of reply. Vendors, regulators, and affected parties may submit a right of reply at any time. We amend findings where factual correction is warranted. Our aim is to work together to help the industry overcome the risks of AI hallucinations and blind spots — we are not against any party.

Citation IDs

Every published finding carries a unique Citation ID — a permanent, citable reference to that specific verified discovery by the RLB Specialist Panel.

The Citation ID format is:

RLB-{H|B}-{Jurisdiction}-{Body}-{Regulation}-Q{seq}-{Model}

Where H indicates a Hallucination and B indicates a Blind Spot. Examples:

Citation IDs are immutable. When a finding is updated — because the underlying regulation has been amended, or new substrate has surfaced — the original Citation ID stays accessible as a historical record, and the current version receives a version suffix. External citations always resolve to the version that was originally cited.
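
The resolution rule can be sketched as below. The page does not specify the form of the version suffix, so the `-v{n}` suffix used here is purely an assumption for illustration, as are the `FindingHistory` class and its method names.

```python
import re


class FindingHistory:
    """Hypothetical store of every version of one finding."""

    def __init__(self, citation_id, original_text):
        self.citation_id = citation_id
        self._versions = [original_text]   # index 0 is the original finding

    def amend(self, new_text):
        """Record an updated version and return its suffixed ID.

        ASSUMPTION: the suffix format "-v{n}" is invented for this sketch.
        """
        self._versions.append(new_text)
        return f"{self.citation_id}-v{len(self._versions)}"

    def resolve(self, cited_id):
        """Return the text of the version that was actually cited."""
        if cited_id == self.citation_id:
            return self._versions[0]       # unsuffixed ID: original record
        match = re.fullmatch(
            re.escape(self.citation_id) + r"-v([0-9]+)", cited_id
        )
        if match and 2 <= int(match.group(1)) <= len(self._versions):
            return self._versions[int(match.group(1)) - 1]
        raise KeyError(cited_id)
```

Under these assumptions, a citation of the unsuffixed ID keeps returning the original record after any number of amendments, which is the property described above.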

Each finding's Citation ID anchors the URL on its source page, so a Citation ID can be shared as a direct link to the Citation Card.
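
A Citation ID in the base format above can be split into its fields with a short parser. This is a sketch under stated assumptions: that no field contains a hyphen, that the sequence is numeric, and that no version suffix is present. The names `CitationID` and `parse_citation_id` are hypothetical, and the ID in the usage note is a made-up placeholder, not a published finding.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationID:
    kind: str          # "H" = Hallucination, "B" = Blind Spot
    jurisdiction: str
    body: str
    regulation: str
    seq: str           # kept as text so any zero padding survives
    model: str

    def format(self):
        """Rebuild the ID string in the documented field order."""
        return (f"RLB-{self.kind}-{self.jurisdiction}-{self.body}-"
                f"{self.regulation}-Q{self.seq}-{self.model}")


# ASSUMPTION: fields are hyphen-free and the sequence is ASCII digits.
_PATTERN = re.compile(
    r"RLB-(?P<kind>[HB])-(?P<jurisdiction>[^-]+)-(?P<body>[^-]+)-"
    r"(?P<regulation>[^-]+)-Q(?P<seq>[0-9]+)-(?P<model>[^-]+)"
)


def parse_citation_id(text):
    """Parse a base-format Citation ID or raise ValueError."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Citation ID: {text!r}")
    return CitationID(**match.groupdict())
```

For the placeholder `RLB-H-XX-BODY-REG1-Q007-MODELA`, `parse_citation_id` returns a record with `kind` equal to `"H"` and `seq` equal to `"007"`, and calling `format()` on it reproduces the input string.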

Audience tagging

Every confirmed finding is tagged with the audiences likely to encounter the hallucination or blind spot in real workflows. The taxonomy covers:

The 20 corporate departments

Each scope description below also appears as a hover tooltip on the matrix headers and case study badges across the site.

# | Department | Scope
1 | Compliance | Internal regulatory monitoring, policy enforcement, control framework oversight.
2 | Legal | General counsel, contracts, litigation, M&A legal, employment law, intellectual property.
3 | Finance | Financial reporting, FP&A, accounting, capital allocation; includes investment / portfolio management in asset-management firms.
4 | Risk | Enterprise risk management, operational risk, market and credit risk; includes actuarial work in insurance.
5 | Treasury | Cash management, FX, hedging, debt management, banking relationships.
6 | Tax | Direct tax, indirect tax, transfer pricing, tax compliance, tax planning.
7 | Operations | Business operations, production, customer operations, service delivery, process management.
8 | Technology & Data | IT, software development, cybersecurity, data management, AI / ML operations.
9 | Human Resources | Talent acquisition, employee relations, compensation, training, HR compliance.
10 | Internal Audit | Internal audit, controls testing, audit committee support, fraud investigations.
11 | Governance & Company Secretarial | Board administration, corporate governance, regulatory filings, corporate records.
12 | Product & Business Development | Product management, product strategy, business development, partnerships; includes strategy and corporate development.
13 | Marketing & Communications | Brand, marketing, PR, internal communications, content.
14 | Procurement & Supply Chain | Sourcing, vendor management, supply chain operations, contract management.
15 | ESG & Sustainability | Environmental, Social, Governance reporting; sustainability strategy, climate risk, disclosure.
16 | Regulatory Affairs | External regulator interface; preparation and submission of dossiers, registrations, approvals, marketing authorisations; ongoing regulator dialogue.
17 | Health, Safety & Environment (HSE) | Workplace safety, environmental compliance, hazard management, incident response, HSE certifications.
18 | Investor Relations | Disclosure obligations, shareholder communications, analyst relations, AGM / EGM coordination, market-abuse compliance.
19 | R&D / Engineering | Research and development, product innovation, technical engineering; includes medical affairs and clinical research in pharma / biotech.
20 | Quality Assurance | GxP / ISO quality systems, quality control, quality management systems, supplier quality, product release.

Case studies are published for each audience: at the (jurisdiction × profession) level for practitioners, and at the (jurisdiction × sub-sector × department) level for sector firms. Each case study demonstrates the audience-specific exposure analysis and practical safeguards.

Partnership

RLB partners on a services-led basis. We do not license our research substrate or question banks as datasets. We engage to apply the RLB Specialist Panel's expertise to your specific need.

Beyond regulation. The methodology — verifying AI outputs against authoritative primary sources, classifying failures into Hallucination Modes and Blind Spots, and tagging by audience — applies to any critical-accuracy domain where authoritative sources exist and the consequences of AI misinformation are material. Under AI Labs partner engagements, we can extend the programme to medicine, personal tax, investment research, banking product disclosures, court precedent, professional standards, and other domains. Regulation is the first published application; the methodology travels.

Submit a partnership inquiry: /partnership/.