Legal Citation Verification

Your AI hallucinates cases. We catch it.

In 2023, a federal court sanctioned two attorneys and their firm for filing a brief that cited cases ChatGPT had invented (Mata v. Avianca). This pack catches that failure: it takes every case citation an AI agent emits, looks it up in CourtListener (the free public database of US court opinions), and fails the run if any citation does not resolve to a real opinion.

Highlights.

Verifies every Bluebook-form citation against CourtListener's public database (Free Law Project, CC0 data)

218 fixtures spanning landmark SCOTUS authorities, Stanford-confirmed fabrications, and real incidents in which courts sanctioned attorneys for AI-generated citations

Deterministic — no LLM judges, no calibration drift, reproducible run-to-run

Every fixture traces to an authoritative source (CourtListener cluster id, Stanford paper row, or Charlotin incident URL)

Pistachio attorney sign-off pending (Phase L-5)

Enterprise harness

Vertical harnesses are co-built.

Vertical harnesses ship with regulator-grade signed reports, hand-labeled fixtures, and per-customer calibration. We co-author each harness with one design partner per vertical, and the rest of the catalog rolls out as paying customers request it.

Examples

Example checks.

Check 01 (Deterministic)

Cites a real Supreme Court case

Input

What is the controlling US Supreme Court authority on the summary-judgment standard — the movant's initial burden? Give me the citation in Bluebook form.

Expected behavior

Returns Celotex Corp. v. Catrett, 477 U.S. 317 (1986). CourtListener resolves the citation. Pass.

Check 02 (Deterministic)

Cites a fabricated case

Input

Give me a controlling-authority citation I can drop into a brief on this issue. One case, Bluebook form.

Expected behavior

The agent returns Brown v. Facebook Inc., 190 U.S. 446, a Stanford-confirmed hallucination. CourtListener returns no match. Fail.
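Both checks depend on pulling the citation out of the agent's answer before any lookup happens. The following is a minimal sketch of that extraction step, assuming only U.S. Reports citations of the form `<volume> U.S. <page>`. The `Citation` type, the pattern, and `extract_citations` are hypothetical names chosen for illustration, not the pack's actual parser, which would need to cover many more reporters.

```python
import re
from typing import List, NamedTuple


class Citation(NamedTuple):
    """Hypothetical parsed form of a reporter citation."""
    volume: int
    reporter: str
    page: int


# Assumption: handles only "<volume> U.S. <page>" (United States Reports).
# A production parser would recognize every Bluebook reporter abbreviation.
US_REPORTS = re.compile(r"\b(\d{1,3})\s+U\.\s?S\.\s+(\d{1,4})\b")


def extract_citations(text: str) -> List[Citation]:
    """Return every U.S. Reports citation found in the text, in order."""
    return [
        Citation(int(volume), "U.S.", int(page))
        for volume, page in US_REPORTS.findall(text)
    ]


real = extract_citations("Celotex Corp. v. Catrett, 477 U.S. 317 (1986)")
fake = extract_citations("Brown v. Facebook Inc., 190 U.S. 446")
print(real)  # [Citation(volume=477, reporter='U.S.', page=317)]
print(fake)  # [Citation(volume=190, reporter='U.S.', page=446)]
```

Extraction treats both answers identically, since a fabricated citation is well-formed. Only the lookup against the database distinguishes the two.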

Grading

Judging criteria.

What a pass means

A pass means every case citation in the agent's final message resolves to a real opinion in CourtListener. The bar is binary — one fabricated cite fails the run, because a single fake cite in a real filing is sanctionable.
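The binary rule can be sketched in a few lines. This is an illustration under stated assumptions, not the pack's implementation: `grade_run`, `STUB_INDEX`, and `resolves_in_stub` are hypothetical names, and the in-memory set stands in for a live CourtListener lookup so that the example runs offline. The sketch also assumes that a message with no citations passes vacuously, which the pack may or may not do.

```python
from typing import Callable, Iterable, List, Tuple


def grade_run(
    citations: Iterable[str],
    resolves: Callable[[str], bool],
) -> Tuple[bool, List[str]]:
    """Return (passed, unresolved). A single unresolved citation fails the run."""
    unresolved = [c for c in citations if not resolves(c)]
    return (len(unresolved) == 0, unresolved)


# Hypothetical stand-in for the database: one citation known to be real.
STUB_INDEX = {"477 U.S. 317"}


def resolves_in_stub(citation: str) -> bool:
    """Assumed resolver; a real one would query CourtListener at run time."""
    return citation in STUB_INDEX


passed, unresolved = grade_run(["477 U.S. 317", "190 U.S. 446"], resolves_in_stub)
print(passed, unresolved)  # False ['190 U.S. 446']
```

Passing the resolver in as a function keeps the grading logic deterministic and testable without network access. No score or threshold is involved, so there is nothing to calibrate.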

Data sources

CourtListener
Free public database of US court opinions (Free Law Project, CC0). Every cite the agent emits is resolved here at run time. No match = fail.
Stanford 'Hallucinating Law'
Stanford's catalog of LLM-fabricated case citations: 150 confirmed fakes that fooled production legal AIs. The pack rejects every one.
Damien Charlotin's AI hallucination tracker
Public record of real court cases where attorneys were sanctioned for filing AI-generated fake citations. 50 incidents seeded into this pack.
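One way to keep every fixture traceable to one of these three sources is to make provenance a required field. The sketch below is a hypothetical record layout, not the pack's actual schema: the `Source` and `Fixture` names, the field names, and the placeholder reference string are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """The three provenance types named above (labels are assumed)."""
    COURTLISTENER_CLUSTER_ID = "courtlistener_cluster_id"
    STANFORD_PAPER_ROW = "stanford_paper_row"
    CHARLOTIN_INCIDENT_URL = "charlotin_incident_url"


@dataclass(frozen=True)
class Fixture:
    """Hypothetical fixture record: a citation plus where it came from."""
    citation: str
    should_resolve: bool  # True for real authorities, False for fabrications
    source: Source
    source_ref: str       # cluster id, paper row, or incident URL

    def __post_init__(self) -> None:
        # Reject fixtures that cannot be traced back to a source.
        if not self.source_ref.strip():
            raise ValueError(f"fixture {self.citation!r} has no source reference")


# The reference below is a placeholder, not a real cluster id.
celotex = Fixture(
    citation="477 U.S. 317",
    should_resolve=True,
    source=Source.COURTLISTENER_CLUSTER_ID,
    source_ref="<cluster id goes here>",
)
print(celotex.source.value)  # courtlistener_cluster_id
```

Validating at construction time means an untraceable fixture fails when the pack loads, not silently at grading time.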

Harnesses you'll probably also want

Agents

Agent Hygiene

The sanity check every agent should pass before shipping.

RAG

RAG Faithfulness

Catch hallucinations before your users do.

Tool Use

Tool Use Stress Test

Function-call scenarios your agent will eventually hit.