The RegLegBrief methodology was developed for financial regulation. The underlying mechanics — verifying AI output against an authoritative primary source, classifying failure modes, publishing only negative findings with immutable Citation IDs — generalise to any critical-accuracy domain where the same three conditions hold.
Last updated 14 Jun 2026 · For: AI labs extending model evaluation · sector regulators outside finance · professional bodies · standards-setting organisations · operators in critical-accuracy domains
Financial regulation is one instance of a broader pattern. The pattern is: a primary source published by an authoritative body, AI being used to interpret or restate that source, and material professional, clinical, safety, financial, or legal consequences when the interpretation drifts.
Medical guidelines. Tax authority rulings. Court precedent. Building codes. Aviation safety standards. Cybersecurity frameworks. Drug interaction databases. Clinical trial protocols. The substrate changes; the verification layer does not.
The RegLegBrief verification methodology was developed against financial regulation: prudential rules, conduct standards, capital markets instruments, AML/CFT frameworks. The substrate construction is regulator-specific. The verification mechanics are not.
What makes the methodology work in financial regulation is what makes it work anywhere else. An authoritative body (a regulator) publishes a primary source (an instrument). AI tools mediate how regulated entities and professionals interpret that source. The gap between what the AI says and what the instrument says is the hallucination surface. RegLegBrief documents that gap against the primary source, classifies the failure mode, and publishes the finding with an immutable Citation ID.
Replace "regulator" with "WHO" or "FDA" or "IRS" or "ICAO" or "NIST." Replace "instrument" with "clinical guideline" or "tax ruling" or "aviation safety standard" or "cybersecurity framework." The mechanics are identical.
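The mechanics described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not RegLegBrief's actual implementation: the `Finding` record, the `CID-` identifier format, and the hash-based derivation are all hypothetical, and the normalised substring check is only a stand-in for human verbatim verification against the primary source.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Optional


def normalise(text: str) -> str:
    """Collapse whitespace and case so layout differences do not count as drift."""
    return re.sub(r"\s+", " ", text).strip().lower()


@dataclass(frozen=True)  # frozen: a finding cannot be mutated once created
class Finding:
    citation_id: str   # hypothetical ID format, derived from the finding's content
    source_ref: str    # e.g. instrument or guideline name plus edition
    ai_claim: str      # the passage the AI attributed to the source
    failure_mode: str  # label supplied by the reviewer


def citation_id(source_ref: str, ai_claim: str) -> str:
    """Derive a stable identifier from the finding's content (an assumption)."""
    digest = hashlib.sha256(f"{source_ref}\n{ai_claim}".encode("utf-8")).hexdigest()
    return f"CID-{digest[:12].upper()}"


def verify(ai_quote: str, source_text: str, source_ref: str,
           failure_mode: str = "unclassified") -> Optional[Finding]:
    """Return None when the quoted passage appears verbatim in the source.

    Only negative results produce a record, mirroring the
    negative-findings-only publication rule.
    """
    if normalise(ai_quote) in normalise(source_text):
        return None
    return Finding(citation_id(source_ref, ai_quote), source_ref, ai_quote, failure_mode)
```

Deriving the identifier from the finding's content means the same finding always maps to the same ID, which is one simple way to make an ID stable; a real register might instead assign sequential IDs.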
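For the methodology to apply to a new domain, three conditions must hold: an authoritative body publishes a primary source; AI is used to interpret or restate that source; and drift in that interpretation carries material professional, clinical, safety, financial, or legal consequences. Each is necessary; together they are sufficient.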
Financial regulation satisfies all three. Medical guidelines satisfy all three. Tax authority guidance satisfies all three. Building codes, aviation safety, cybersecurity, drug interactions, clinical trial protocols — all satisfy the three conditions. The methodology applies to each.
Indicative cross-domain map: the substrate domain on the left, then the primary source authority, then the AI uses where verification matters.
| Substrate domain | Primary source authority | Where AI verification matters |
|---|---|---|
| Clinical guidelines | WHO, NICE (UK), USPSTF, specialty society guidelines | Clinical decision support, AI scribes, prior-authorisation drafting, patient communication |
| FDA / EMA approvals and labels | FDA, EMA, MHRA, other national drug regulators | Prescribing-information AI tools, off-label use guidance, drug-drug interaction queries |
| Drug interaction databases | Pharmacopeial bodies, Lexicomp, Micromedex | Pharmacist AI assistants, prescribing copilots, automated interaction checks |
| Tax authority guidance | IRS rulings, HMRC guidance, tax treaty texts, OECD model | AI tax preparation, advisory opinion drafting, transfer-pricing analysis |
| Court precedent / case law | Federal, state, and national court systems; international tribunals | Legal research AI, brief drafting, precedent analysis, AI litigation tools |
| Building codes & safety standards | National building codes, ICC, ISO, ASTM, BSI | AI design-review, code-compliance checking, AI specification drafting |
| Aviation safety | FAA, EASA, ICAO Annexes, manufacturer maintenance manuals | AI maintenance-procedure look-up, AI flight-crew decision support, AI safety-management documentation |
| Cybersecurity frameworks | NIST CSF + AI RMF, ISO 27001, CIS Controls, ENISA guidance | AI compliance-mapping tools, control-language drafting, gap analysis |
| Clinical trial protocols | ICH-GCP, FDA IND requirements, EU CTR, IRB / ethics standards | AI protocol drafting, deviation classification, regulatory submission preparation |
| Accounting standards | IFRS Foundation, FASB, national-GAAP setters | AI technical-accounting opinion drafting, restatement memos, audit AI tools |
| Engineering codes | ASME, IEEE, IEC, professional engineering bodies | AI design-validation, AI standards-compliance checking |
| Scientific consensus documents | IPCC assessment reports, NIH consensus, Cochrane reviews | AI science-communication tools, evidence-synthesis AI, policy-support AI |
The list is indicative, not exhaustive. The three conditions are the qualifying test, not membership of this list.
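As a sketch, the qualifying test can be written as a three-way conjunction. The field names below are hypothetical, and the sketch assumes the three conditions correspond to the three elements of the pattern described earlier: an authoritative primary source, AI interpreting or restating it, and material consequences when the interpretation drifts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainCandidate:
    name: str
    has_authoritative_primary_source: bool  # published by an authoritative body
    ai_interprets_or_restates_source: bool  # AI mediates how the source is read
    drift_has_material_consequences: bool   # professional, clinical, safety, financial, or legal


def qualifies(domain: DomainCandidate) -> bool:
    """Each condition is necessary; all three together are sufficient."""
    return (domain.has_authoritative_primary_source
            and domain.ai_interprets_or_restates_source
            and domain.drift_has_material_consequences)
```

A domain that is absent from the table but passes `qualifies` is in scope; a domain that fails any one condition is not.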
Three illustrative engagements show how the methodology maps to substrates outside financial regulation. Each follows the same audit shape: substrate construction, asymmetric question design, multi-subject AI testing, primary-source verification, finding publication.
Verifying AI restatements of WHO HIV treatment guidelines
Substrate: WHO consolidated guidelines on HIV antiretroviral therapy, current edition. AI subjects: clinical-decision-support AI tools used by health workers in primary care. Audit shape: probe each AI subject on regimen selection, dosing, monitoring intervals, and contraindications. Verify each AI output verbatim against the WHO guideline. Failure modes: outdated regimen (training data from superseded edition), misstated dosing (averaged from secondary summaries), misattributed contraindication (false co-citation pattern). Publish findings with Citation IDs that AI vendors and health-system buyers can audit against.
Verifying AI restatements of IRS Revenue Rulings in advisory opinions
Substrate: IRS Revenue Rulings, Treasury Regulations, and applicable Internal Revenue Code provisions. AI subjects: tax-preparation AI tools and AI assistants used by CPAs and tax attorneys for advisory opinion drafting. Audit shape: probe each AI subject on revenue ruling specifics, applicable thresholds, and procedural requirements. Verify against the actual IRS-published rulings. Failure modes: superseded ruling cited as current, threshold drift (16% becomes 18%), propositions misattributed across sibling rulings. Findings inform AI vendor remediation and CPA-firm verification workflows.
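Threshold drift of the "16% becomes 18%" kind lends itself to a cheap automated pre-check. The sketch below is an assumption about how such a check might look, not a description of the audit tooling: it only flags percentages that appear in an AI output but nowhere in the source text, as candidates for verbatim verification by a reviewer. The function names are hypothetical.

```python
import re

# Matches figures such as "16%", "16 %", or "2.5%".
PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def percentages(text: str) -> set[float]:
    """All percentage figures that appear in the text."""
    return {float(m) for m in PERCENT.findall(text)}


def threshold_drift(ai_output: str, source_text: str) -> set[float]:
    """Percentages asserted by the AI that never appear in the source."""
    return percentages(ai_output) - percentages(source_text)
```

An empty result does not establish correctness: a figure can appear in the source and still be attached to the wrong proposition, which is why the check is a filter ahead of manual verification rather than a replacement for it.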
Verifying AI compliance-mapping tools against the NIST AI Risk Management Framework
Substrate: NIST AI Risk Management Framework (AI RMF 1.0) and Generative AI Profile. AI subjects: governance-tech AI tools that automate AI RMF compliance mapping for enterprise customers. Audit shape: probe each AI subject on RMF function names, subcategory references, and Profile-specific guidance. Verify against the NIST-published framework. Failure modes: subcategory misattribution, Profile guidance restatement drift, conflation of the AI RMF with the NIST Cybersecurity Framework (CSF). Findings expose where AI compliance tools confidently produce compliance mappings that do not match the framework.
Authoritative voices on AI accuracy in critical-accuracy domains beyond financial regulation. Standards bodies, medical journals, sectoral regulators, and global health authorities all name the same surface RegLegBrief audits.
"Citation of AI-generated material as a primary source is not acceptable." NEJM AI Editorial Policies, Massachusetts Medical Society (applied across NEJM, NEJM Evidence, and NEJM Catalyst). Source: NEJM AI.
"The production of confidently stated but erroneous or false content (known colloquially as 'hallucinations' or 'fabrications') by which users may be misled or deceived." National Institute of Standards and Technology (U.S. Department of Commerce), NIST AI 600-1, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, July 2024; definition of "Confabulation," one of twelve generative-AI-specific risks. Source: NIST AI 600-1 (PDF).
"Generative AI technologies have the potential to improve health care but only if those who develop, regulate, and use these technologies identify and fully account for the associated risks." Dr Jeremy Farrar, Chief Scientist, World Health Organization, on release of WHO guidance on large multi-modal models, 18 January 2024. Source: WHO.
LLMs that summarise medical notes "can hallucinate or include diagnoses not discussed in the visit," with "unforeseen, emergent consequences." Dr Robert M. Califf, Commissioner of Food and Drugs, U.S. FDA (with FDA colleagues), FDA Perspective on the Regulation of Artificial Intelligence in Health Care and Biomedicine, JAMA, published online 24 October 2024. Source: JAMA.
"There is not yet a readiness for tax authorities to completely endorse an LLM functionality." Danny Werfel, Commissioner of the Internal Revenue Service (2023–2025), interview to CBS News on AI use in tax preparation. Source: CBS News.
A cross-domain engagement begins with a 2–3 week scoping window before the audit phase. Five things are agreed during scoping:
Once scoping is agreed, the audit phase runs the same shape as the financial-regulation engagements: substrate-bound audit, multi-subject probe, verbatim verification, finding publication.
Services-led. Cross-domain engagements are scoped by domain breadth, AI-subject count, and publication policy.
| Scope dimension | Typical cross-domain engagement |
|---|---|
| Domain breadth | A single substrate (e.g., WHO HIV guidelines current edition) or a thematic cluster (e.g., FDA approvals for a therapeutic area) |
| AI subjects under audit | RegLegBrief (RLB) standard subjects (Sonnet 4.6, Opus 4.7, third subject) plus partner-named domain-specific tools (specialty CDS tools, tax-prep AI, legal-research AI, etc.) |
| Audit cadence | Single audit, recurring (quarterly / semi-annual), or continuous monitoring |
| Publication policy | Public Citation ID register, embargo + publish, or private NDA-only |
| Source-body participation | Authoritative source body engages in right of reply, in industry sensitisation, or stands by silently |
| Confidentiality | NDA governs the engagement; named tools and named source bodies handled per the engagement letter |
Typical first engagement: a single substrate (one guideline, one rule set, one framework), with RLB standard subjects plus one partner-named tool, single audit cycle, public publication policy.
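The scope dimensions above could be captured as a small configuration record. This is an illustrative sketch only: the class names, field names, and the `is_typical_first_engagement` helper are hypothetical, and the defaults simply encode the typical first engagement as described (single audit cycle, public publication, one partner-named tool).

```python
from dataclasses import dataclass
from enum import Enum


class Cadence(Enum):
    SINGLE = "single audit"
    RECURRING = "recurring (quarterly / semi-annual)"
    CONTINUOUS = "continuous monitoring"


class PublicationPolicy(Enum):
    PUBLIC_REGISTER = "public Citation ID register"
    EMBARGO_THEN_PUBLISH = "embargo + publish"
    PRIVATE_NDA = "private NDA-only"


@dataclass(frozen=True)
class EngagementScope:
    substrate: str                 # one guideline, rule set, or framework
    partner_tools: tuple[str, ...]  # partner-named tools, audited alongside the standard subjects
    cadence: Cadence = Cadence.SINGLE
    publication: PublicationPolicy = PublicationPolicy.PUBLIC_REGISTER


def is_typical_first_engagement(scope: EngagementScope) -> bool:
    """Single cycle, public publication, exactly one partner-named tool."""
    return (scope.cadence is Cadence.SINGLE
            and scope.publication is PublicationPolicy.PUBLIC_REGISTER
            and len(scope.partner_tools) == 1)
```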
RegLegBrief is not a regulator, clinician, tax authority, or any other authoritative source body. RLB is a verification service that audits AI output against authoritative sources. The authority remains with the source body; RLB documents how AI is restating what that body has said.
Most domain-specific AI evaluation startups build their own benchmarks (often AI-generated test questions). RLB's substrate is the actual published authoritative source — the WHO guideline, the IRS ruling, the NIST framework — and the verification is verbatim against that source. The methodology is source-bound, not benchmark-bound.
AI labs can commission a cross-domain engagement; those extending model evaluation beyond regulation are a primary commissioning audience. The engagement scopes a new domain substrate, runs the audit, and delivers findings the AI lab can use in pre-deployment evaluation and post-deployment monitoring for that domain.
The methodology requires that the source be authoritative at the time of audit. Scientific consensus shifts; case law evolves; regulator interpretations update. Findings are versioned to the substrate edition under audit, and refreshed when the substrate is updated. Contested authority within a domain is handled in scoping — typically by selecting a single agreed authoritative reference for the audit period.
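Versioning findings to a substrate edition makes staleness mechanically detectable. The sketch below assumes a hypothetical `VersionedFinding` record and a mapping from substrate name to its current edition; neither is taken from the RegLegBrief register.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedFinding:
    citation_id: str
    substrate: str  # e.g. a guideline or framework name
    edition: str    # the edition the finding was verified against


def needs_refresh(findings: list[VersionedFinding],
                  current_editions: dict[str, str]) -> list[VersionedFinding]:
    """Findings whose audited edition is no longer the current one.

    A substrate missing from current_editions is also flagged, on the
    assumption that an unknown edition should be re-checked rather than trusted.
    """
    return [f for f in findings
            if current_editions.get(f.substrate) != f.edition]
```

Flagged findings are not necessarily wrong under the new edition; they are simply unverified against it until the audit is re-run.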
Cross-domain findings live in the same Citation ID architecture as financial-regulation findings. The public Hallucination Register currently surfaces the financial-regulation findings; cross-domain findings can be surfaced separately, integrated, or published on a domain-specific subdomain depending on the partner's preference.
A single-substrate audit on a defined AI subject set, single cycle. Typically 6–10 weeks end to end including scoping. Smaller than that and the substrate construction cost dominates; we'd point you at a different engagement shape.