How does a cross-domain engagement differ from a regulator-track engagement?

Substrate construction is different — the primary sources are clinical guidelines, tax rulings, court precedent, building codes, etc. instead of regulator instruments. Audit methodology is the same: substrate-bound verification, asymmetric question design, multi-subject AI testing, publish-only-negative findings.

Do you have findings in non-financial domains live today?

The current Hallucination Register is financial-regulation-focused. Cross-domain engagements scope a new substrate domain, build the corpus, run the audit, and publish the findings under the same Citation ID architecture.

Who commissions a cross-domain engagement?

AI labs extending model evaluation beyond regulation; sector regulators outside finance; professional bodies (medical associations, engineering bodies, accountancy bodies); operators in critical-accuracy domains (hospitals, airlines, pharmacies, large law firms, accounting firms); standards-setting bodies (NIST, ISO, ICAO, IAEA, etc.).

What do you need to scope a new domain?

Definition of the authoritative primary source set, identification of the AI subjects in production use, agreement on the failure-mode taxonomy that applies in the domain, NDA, and the engagement letter. Typical scoping window is 2–3 weeks.

Beyond regulation

Q: What makes a domain suitable for the RLB methodology?

Three conditions. An authoritative primary source exists in the domain. AI is being used to interpret or restate that source. Material consequences follow from misinterpretation. Wherever those three conditions hold, the methodology applies — financial regulation is one instance, not the limit.

The RegLegBrief methodology was developed for financial regulation. The underlying mechanics — verifying AI output against an authoritative primary source, classifying failure modes, publishing only negative findings with immutable Citation IDs — generalise to any critical-accuracy domain where the same three conditions hold.

Last updated 14 Jun 2026 · For: AI labs extending model evaluation · sector regulators outside finance · professional bodies · standards-setting organisations · operators in critical-accuracy domains

Bottom line

Wherever an authoritative source exists, AI is being used to interpret it, and getting it wrong has material consequences, the methodology works.

Financial regulation is one instance of a broader pattern. The pattern is: a primary source published by an authoritative body, AI being used to interpret or restate that source, and material professional, clinical, safety, financial, or legal consequences when the interpretation drifts.

Medical guidelines. Tax authority rulings. Court precedent. Building codes. Aviation safety standards. Cybersecurity frameworks. Drug interaction databases. Clinical trial protocols. The substrate changes; the verification layer does not.

In this playbook

Why the methodology generalises
The three conditions
Where it applies
Worked examples
Voices from beyond regulation
What RLB delivers
Scoping a new domain
Engagement model
FAQ

Already mapping out a cross-domain engagement? The inquiry form pre-fills with the cross-domain track selected. Describe the domain, the authoritative source set, and the AI subjects you want under audit.

Discuss a cross-domain engagement →

1. Why the methodology generalises

The RegLegBrief verification methodology was developed against financial regulation: prudential rules, conduct standards, capital markets instruments, AML/CFT frameworks. The substrate construction is regulator-specific. The verification mechanics are not.

What makes the methodology work in financial regulation is what makes it work anywhere else. An authoritative body (a regulator) publishes a primary source (an instrument). AI tools mediate how regulated entities and professionals interpret that source. The gap between what the AI says and what the instrument says is the hallucination surface. RegLegBrief documents that gap against the primary source, classifies the failure mode, and publishes the finding with an immutable Citation ID.

Replace "regulator" with "WHO" or "FDA" or "IRS" or "ICAO" or "NIST." Replace "instrument" with "clinical guideline" or "tax ruling" or "aviation safety standard" or "cybersecurity framework." The mechanics are identical.

2. The three conditions

For the methodology to apply to a new domain, three conditions must hold. Each is necessary; together they are sufficient.

Condition 1

An authoritative primary source exists

A document, body of documents, or maintained register issued by an authoritative source — a regulator, a standards body, a court, a clinical-guidance authority, a scientific consensus body. The source is the ground truth against which AI output is verified.

Condition 2

AI is being used to interpret it

Practitioners, operators, or end users are running AI tools over the source — summarising it, restating it in client deliverables, querying it for specific propositions, generating compliance or operational outputs from it. The AI sits between the source and the work product.

Condition 3

Material consequences follow from misinterpretation

Getting the interpretation wrong produces real harm: patient harm in clinical work, legal liability in tax or legal work, safety incidents in aviation or engineering, regulatory sanctions, professional discipline, or financial loss. Without material consequences, the methodology is interesting but not necessary.

Financial regulation satisfies all three. Medical guidelines satisfy all three. Tax authority guidance satisfies all three. Building codes, aviation safety, cybersecurity, drug interactions, clinical trial protocols — all satisfy the three conditions. The methodology applies to each.
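
The qualifying test reduces to a conjunction of three yes/no checks. A minimal Python sketch of that test follows. `DomainProfile`, its field names, and the "Pub-quiz trivia" counter-example are hypothetical illustrations, not part of any RLB tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainProfile:
    """Hypothetical scoping record; field names are illustrative assumptions."""
    name: str
    has_authoritative_source: bool  # Condition 1
    ai_interprets_source: bool      # Condition 2
    material_consequences: bool     # Condition 3


def methodology_applies(domain: DomainProfile) -> bool:
    # Each condition is necessary; together they are sufficient.
    return (
        domain.has_authoritative_source
        and domain.ai_interprets_source
        and domain.material_consequences
    )


clinical = DomainProfile("Clinical guidelines", True, True, True)
trivia = DomainProfile("Pub-quiz trivia", False, True, False)  # invented counter-example

print(methodology_applies(clinical))  # True
print(methodology_applies(trivia))    # False
```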

3. Where it applies

Indicative cross-domain map: the substrate domain on the left, then the primary source authority, then the AI uses where verification matters.

Substrate domain	Primary source authority	Where AI verification matters
Clinical guidelines	WHO, NICE (UK), USPSTF, specialty society guidelines	Clinical decision support, AI scribes, prior-authorisation drafting, patient communication
FDA / EMA approvals + label	FDA, EMA, MHRA, other national drug regulators	Prescribing-information AI tools, off-label use guidance, drug-drug interaction queries
Drug interaction databases	Pharmacopeial bodies, Lexicomp, Micromedex	Pharmacist AI assistants, prescribing copilots, automated interaction checks
Tax authority guidance	IRS rulings, HMRC guidance, tax treaty texts, OECD model	AI tax preparation, advisory opinion drafting, transfer-pricing analysis
Court precedent / case law	Federal, state, and national court systems; international tribunals	Legal research AI, brief drafting, precedent analysis, AI litigation tools
Building codes & safety standards	National building codes, ICC, ISO, ASTM, BSI	AI design-review, code-compliance checking, AI specification drafting
Aviation safety	FAA, EASA, ICAO Annexes, manufacturer maintenance manuals	AI maintenance-procedure look-up, AI flight-crew decision support, AI safety-management documentation
Cybersecurity frameworks	NIST CSF + AI RMF, ISO 27001, CIS Controls, ENISA guidance	AI compliance-mapping tools, control-language drafting, gap analysis
Clinical trial protocols	ICH-GCP, FDA IND requirements, EU CTR, IRB / ethics standards	AI protocol drafting, deviation classification, regulatory submission preparation
Accounting standards	IFRS Foundation, FASB, national-GAAP setters	AI technical-accounting opinion drafting, restatement memos, audit AI tools
Engineering codes	ASME, IEEE, IEC, professional engineering bodies	AI design-validation, AI standards-compliance checking
Scientific consensus documents	IPCC assessment reports, NIH consensus, Cochrane reviews	AI science-communication tools, evidence-synthesis AI, policy-support AI

The list is indicative, not exhaustive. The three conditions are the qualifying test, not membership of this list.

4. Worked examples

Three illustrative engagements showing how the methodology maps to non-regulatory substrates. Each follows the same audit shape: substrate construction, asymmetric question design, multi-subject AI testing, primary-source verification, finding publication.

Medical — clinical guideline AI

Verifying AI restatements of WHO HIV treatment guidelines

Substrate: WHO consolidated guidelines on HIV antiretroviral therapy, current edition. AI subjects: clinical-decision-support AI tools used by health workers in primary care. Audit shape: probe each AI subject on regimen selection, dosing, monitoring intervals, and contraindications. Verify each AI output verbatim against the WHO guideline. Failure modes: outdated regimen (training data from superseded edition), misstated dosing (averaged from secondary summaries), misattributed contraindication (false co-citation pattern). Publish findings with Citation IDs that AI vendors and health-system buyers can audit against.
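
The verbatim check in this example can be approximated in code for the simplest case: a passage the AI presents as a direct quotation. The following is a minimal Python sketch, not RLB's tooling. `normalise` and `appears_verbatim` are hypothetical helper names, and the substrate text is an invented placeholder rather than wording from any WHO guideline.

```python
import re


def normalise(text: str) -> str:
    """Collapse whitespace and lowercase so layout differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def appears_verbatim(quoted: str, substrate: str) -> bool:
    """True if the quoted passage occurs in the substrate after normalisation."""
    return normalise(quoted) in normalise(substrate)


# Invented placeholder text standing in for a passage of the primary source.
substrate = """Review the patient
at the interval stated in the current edition."""

print(appears_verbatim("review the patient at the interval", substrate))  # True
print(appears_verbatim("review the patient every month", substrate))      # False
```

A string match of this kind only catches misquoted passages; paraphrased claims still need a reviewer reading the output against the source.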

Tax — IRS guidance AI

Verifying AI restatements of IRS Revenue Rulings in advisory opinions

Substrate: IRS Revenue Rulings, Treasury Regulations, and applicable Internal Revenue Code provisions. AI subjects: tax-preparation AI tools and AI assistants used by CPAs and tax attorneys for advisory opinion drafting. Audit shape: probe each AI subject on revenue ruling specifics, applicable thresholds, and procedural requirements. Verify against the actual IRS-published rulings. Failure modes: superseded ruling cited as current, threshold drift (16% becomes 18%), misattributed propositions across siblings. Findings inform AI vendor remediation and CPA-firm verification workflows.
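
Threshold drift of the kind described here (16% becomes 18%) lends itself to a mechanical first pass. The sketch below, in Python, flags percentage figures that appear in an AI answer but not in the source passage. It is an illustration under stated assumptions: `percentages` and `threshold_drift` are hypothetical names, both strings are invented placeholders rather than text from any real ruling, and a flagged figure is a candidate for human review, not a confirmed finding.

```python
import re

PERCENT = re.compile(r"\d+(?:\.\d+)?\s*%")


def percentages(text: str) -> set[str]:
    """Every percentage figure in the text, normalised to a form like '16%'."""
    return {"".join(m.group().split()) for m in PERCENT.finditer(text)}


def threshold_drift(source_passage: str, ai_output: str) -> set[str]:
    """Percentages asserted by the AI that do not appear in the source passage."""
    return percentages(ai_output) - percentages(source_passage)


# Both strings are invented placeholders, not text from any real ruling.
source = "The applicable threshold is 16% of the adjusted amount."
answer = "Under the ruling, the threshold is 18% of the adjusted amount."

print(threshold_drift(source, answer))  # {'18%'}
```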

Cybersecurity — NIST AI RMF

Verifying AI compliance-mapping tools against the NIST AI Risk Management Framework

Substrate: NIST AI Risk Management Framework (AI RMF 1.0) and Generative AI Profile. AI subjects: governance-tech AI tools that automate AI RMF compliance mapping for enterprise customers. Audit shape: probe each AI subject on RMF function names, subcategory references, and Profile-specific guidance. Verify against the NIST-published framework. Failure modes: subcategory misattribution, Profile guidance restatement drift, conflation of the AI RMF with the NIST Cybersecurity Framework (CSF). Findings expose where AI compliance tools confidently produce compliance mappings that do not match the framework.

Live finding set — Platform documentation

Confirmed Opus 4.7 hallucinations on seven external platforms (live, published)

Substrate: live developer documentation, signup URLs, and registration flows of seven external platforms (Naver Search Advisor, Seznam Webmaster Tools, Brave Search, Kagi, Mojeek, ROR, Flipboard). AI subject: Claude Opus 4.7. Audit shape: capture verbatim AI assertion at moment of utterance, probe each asserted URL or process against the live platform, publish only confirmed contradictions with primary-source evidence anyone can re-verify. Captured during RegLegBrief's own distribution build, 2026-06-16. View the seven findings →

5. Voices from beyond regulation

Authoritative voices on AI accuracy in critical-accuracy domains beyond financial regulation. Standards bodies, medical journals, sectoral regulators, and global health authorities — all naming the same surface RegLegBrief audits.

"Citation of AI-generated material as a primary source is not acceptable."

— NEJM AI Editorial Policies, Massachusetts Medical Society (applied across NEJM, NEJM Evidence, and NEJM Catalyst). NEJM AI

"The production of confidently stated but erroneous or false content (known colloquially as 'hallucinations' or 'fabrications') by which users may be misled or deceived."

— National Institute of Standards and Technology (U.S. Department of Commerce), NIST AI 600-1, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, July 2024; definition of "Confabulation," one of the twelve generative-AI-specific risks the Profile lists. NIST AI 600-1 (PDF)

"Generative AI technologies have the potential to improve health care but only if those who develop, regulate, and use these technologies identify and fully account for the associated risks."

— Dr Jeremy Farrar, Chief Scientist, World Health Organization, on release of WHO guidance on large multi-modal models, 18 January 2024. WHO

LLMs that summarise medical notes "can hallucinate or include diagnoses not discussed in the visit," with "unforeseen, emergent consequences."

— Dr Robert M. Califf, Commissioner of Food and Drugs, U.S. FDA (with FDA colleagues), FDA Perspective on the Regulation of Artificial Intelligence in Health Care and Biomedicine, JAMA, published online 24 October 2024. JAMA

"There is not yet a readiness for tax authorities to completely endorse an LLM functionality."

— Danny Werfel, Commissioner of the Internal Revenue Service (2023–2025), interview to CBS News on AI use in tax preparation. CBS News

6. What RLB delivers

Domain substrate construction — definition and acquisition of the authoritative primary source set for the domain, with provenance tracking and version control.
Asymmetric question design — domain-specific audit questions calibrated to surface failure modes that matter (clinical, fiscal, safety, legal).
Multi-subject AI testing — RLB standard subjects (Sonnet 4.6, Opus 4.7, third subject) plus any domain-specific AI tools the partner names.
Primary-source verification — every AI claim checked verbatim against the substrate. Failures classified into domain-appropriate failure modes.
Finding publication — with immutable Citation IDs in the same RLB-H- architecture used for financial regulation; cross-linked into the public register if the partner agrees.
Right of reply on every finding — the authoritative source body (or its delegate) is invited to respond before publication.
Domain-specific awareness — continuous monitoring once the domain is scoped, with alerts on new authoritative-source publications and emerging AI hallucination patterns.
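
One way to picture a published finding is as a record that cannot be edited once its Citation ID is assigned. The Python sketch below is an assumption-laden illustration, not RLB's schema. The `Finding` class and its field names are hypothetical, and the ID value is a placeholder, since the full Citation ID format is not given here. The four failure modes are the ones named for the financial-regulation taxonomy.

```python
from dataclasses import dataclass, FrozenInstanceError
from enum import Enum


class FailureMode(Enum):
    # The four modes named for the financial-regulation taxonomy.
    INFERENCE_DRIFT = "inference drift"
    MISSTATED_RULE = "misstated rule"
    MISATTRIBUTED = "misattributed"
    OUTDATED = "outdated"


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Finding:
    citation_id: str        # assigned once, never edited
    substrate_edition: str  # edition of the primary source under audit
    ai_subject: str
    ai_assertion: str       # verbatim AI output
    source_passage: str     # verbatim primary-source text it contradicts
    failure_mode: FailureMode


finding = Finding(
    citation_id="RLB-H-PLACEHOLDER",  # placeholder; real ID format not shown here
    substrate_edition="current edition",
    ai_subject="example subject",
    ai_assertion="invented AI statement",
    source_passage="invented source text",
    failure_mode=FailureMode.OUTDATED,
)

try:
    finding.citation_id = "something-else"
except FrozenInstanceError:
    print("citation ID is immutable")
```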

7. Scoping a new domain

A cross-domain engagement begins with a 2–3 week scoping window before the audit phase. Five things are agreed during scoping:

The authoritative source set. Which documents, registers, or maintained corpora constitute the ground truth for the domain. Provenance, edition control, and update cadence.
The AI subjects. Which AI tools are under audit: foundation models in standard testing, partner-named domain-specific tools, or both.
The failure-mode taxonomy. Which failure modes apply in this domain. The financial-regulation taxonomy (inference drift, misstated rule, misattributed, outdated) generalises, but domain-specific failure modes may need to be added (e.g., off-label conflation in medicine, deprecated-regimen continuation in clinical guidance).
Publication policy. Whether findings are published in the public RLB register, kept private under NDA, or published with a defined embargo window.
Right-of-reply intake. Whether the authoritative source body wants to participate in right-of-reply on findings, and on what cadence.

Once scoping is agreed, the audit phase runs the same shape as the financial-regulation engagements: substrate-bound audit, multi-subject probe, verbatim verification, finding publication.
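
The five scoping items can be tracked as a simple checklist record. The following Python sketch shows one possible shape. `ScopingAgreement`, `PublicationPolicy`, and `open_items` are hypothetical names chosen for illustration, not an RLB artefact.

```python
from dataclasses import dataclass, field, fields
from enum import Enum
from typing import Optional


class PublicationPolicy(Enum):
    PUBLIC_REGISTER = "public RLB register"
    PRIVATE_NDA = "private under NDA"
    EMBARGO = "published after a defined embargo window"


@dataclass
class ScopingAgreement:
    source_set: list[str] = field(default_factory=list)
    ai_subjects: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)
    publication_policy: Optional[PublicationPolicy] = None
    source_body_right_of_reply: Optional[bool] = None  # False is a valid answer

    def open_items(self) -> list[str]:
        """Names of scoping items not yet agreed (still None or empty)."""
        return [f.name for f in fields(self) if getattr(self, f.name) in (None, [])]


scope = ScopingAgreement(
    source_set=["WHO consolidated HIV guidelines, current edition"],
    ai_subjects=["Sonnet 4.6", "Opus 4.7"],
    failure_modes=["inference drift", "misstated rule", "misattributed", "outdated"],
)
print(scope.open_items())  # ['publication_policy', 'source_body_right_of_reply']
```

Treating `False` as an agreed answer matters for the last field: a source body that declines right of reply has still made a scoping decision.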

8. Engagement model

Services-led. Cross-domain engagements are scoped by domain breadth, AI-subject count, and publication policy.

Scope dimension	Typical cross-domain engagement
Domain breadth	A single substrate (e.g., WHO HIV guidelines current edition) or a thematic cluster (e.g., FDA approvals for a therapeutic area)
AI subjects under audit	RLB standard subjects (Sonnet 4.6, Opus 4.7, third subject) plus partner-named domain-specific tools (specialty CDS tools, tax-prep AI, legal-research AI, etc.)
Audit cadence	Single audit, recurring (quarterly / semi-annual), or continuous monitoring
Publication policy	Public Citation ID register, embargo + publish, or private NDA-only
Source-body participation	Authoritative source body engages in right of reply, in industry sensitisation, or stands by silently
Confidentiality	NDA governs the engagement; named tools and named source bodies handled per the engagement letter

Typical first engagement: a single substrate (one guideline, one rule set, one framework), with RLB standard subjects plus one partner-named tool, single audit cycle, public publication policy.

9. FAQ

Is RegLegBrief licensed or accredited in non-regulatory domains?

RegLegBrief is not a regulator, clinician, tax authority, or any other authoritative source body. RLB is a verification service that audits AI output against authoritative sources. The authority remains with the source body; RLB documents how AI is restating what that body has said.

How is this different from a domain-specific AI evaluation startup?

Most domain-specific AI evaluation startups build their own benchmarks (often AI-generated test questions). RLB's substrate is the actual published authoritative source — the WHO guideline, the IRS ruling, the NIST framework — and the verification is verbatim against that source. The methodology is source-bound, not benchmark-bound.

Can AI labs commission a cross-domain engagement directly?

Yes. AI labs extending model evaluation beyond regulation are a primary commissioning audience. The engagement scopes a new domain substrate, runs the audit, and delivers findings the AI lab can use in pre-deployment evaluation and post-deployment monitoring for that domain.

What about domains where the "authoritative source" is contested or evolving?

The methodology requires that the source be authoritative at the time of audit. Scientific consensus shifts; case law evolves; regulator interpretations update. Findings are versioned to the substrate edition under audit, and refreshed when the substrate is updated. Contested authority within a domain is handled in scoping — typically by selecting a single agreed authoritative reference for the audit period.
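
Versioning findings to a substrate edition makes the refresh step easy to express. A minimal Python sketch, assuming each finding stores the edition it was verified against, follows. `VersionedFinding` and `needs_refresh` are hypothetical names, and the IDs and edition labels are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedFinding:
    citation_id: str        # placeholder values below; real ID format not shown
    substrate_edition: str  # edition of the source the finding was verified against


def needs_refresh(findings: list[VersionedFinding], current_edition: str) -> list[str]:
    """Citation IDs whose substrate edition is no longer the current one."""
    return [f.citation_id for f in findings if f.substrate_edition != current_edition]


register = [
    VersionedFinding("RLB-H-PLACEHOLDER-1", "edition-A"),
    VersionedFinding("RLB-H-PLACEHOLDER-2", "edition-B"),
]
print(needs_refresh(register, current_edition="edition-B"))  # ['RLB-H-PLACEHOLDER-1']
```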

How does this interact with the existing financial-regulation register?

Cross-domain findings live in the same Citation ID architecture as financial-regulation findings. The public Hallucination Register currently surfaces the financial-regulation findings; cross-domain findings can be surfaced separately, integrated, or published on a domain-specific subdomain depending on the partner's preference.

What's the smallest engagement you'll scope?

A single-substrate audit on a defined AI subject set, single cycle. Typically 6–10 weeks end to end including scoping. Smaller than that and the substrate construction cost dominates; we'd point you at a different engagement shape.

Ready to scope a cross-domain engagement? The inquiry form pre-fills with the cross-domain track selected. Describe the domain, the authoritative source set, the AI subjects you want under audit, and your publication preference.

Discuss a cross-domain engagement →