Pistachio

Legal Rule Application

Can your AI apply a rule to the facts?

Lawyers spend their training learning to apply written rules to specific facts: given a trademark and a product, where on the Abercrombie spectrum (generic / descriptive / suggestive / arbitrary / fanciful) does it sit? Given a statement and context, is it hearsay? This pack runs your AI through 100 of those rule-application questions from Stanford's LegalBench and checks each answer against what Stanford law faculty marked correct. If your AI can't apply a stated rule to facts, every downstream legal answer is suspect.

Highlights.

  • Backed by Stanford LegalBench: expert-labeled, Apache-2.0
  • 100 fixtures across two rule-application skills (Abercrombie trademark spectrum, FRE hearsay)
  • Deterministic substring-match grading, matching LegalBench's upstream eval_method
  • Every fixture traces to a LegalBench task and row ID for audit
  • Scope: 2 of the 162 LegalBench tasks, so coverage is narrow
  • Pistachio attorney sign-off pending (Phase L-5)
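
As a rough illustration of the traceability point above, a fixture could carry its LegalBench task name and row ID alongside the prompt and the expected label. The sketch below is a minimal Python assumption, not Pistachio's actual schema: the `Fixture` class, its field names, the `audit_ref` helper, and the sample row are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixture:
    """One rule-application question plus its provenance (hypothetical schema)."""

    task: str      # LegalBench task name, e.g. "abercrombie" or "hearsay"
    row_id: int    # row index within that task's data, kept for audit
    prompt: str    # the question shown to the AI
    expected: str  # the annotated label, e.g. "arbitrary" or "Yes"

    def audit_ref(self) -> str:
        # Compact reference an auditor could use to look up the source row.
        return f"{self.task}#{self.row_id}"


# Illustrative placeholder only; this is not a real LegalBench row.
example = Fixture(
    task="abercrombie",
    row_id=0,
    prompt="(illustrative prompt text)",
    expected="arbitrary",
)
print(example.audit_ref())  # abercrombie#0
```

Freezing the dataclass is a design choice in this sketch: a fixture that cannot be mutated after loading is easier to trust in an audit trail.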

Enterprise harness

Vertical harnesses are co-built.

Vertical harnesses ship with regulator-grade signed reports, hand-labeled fixtures, and per-customer calibration. We co-author each one with a single design partner per vertical, and the rest of the catalog rolls out as paying customers request it.

Examples

Example checks.

Check 01 · Deterministic

Classifies a trademark correctly

Input
Given the Abercrombie rule (generic / descriptive / suggestive / arbitrary / fanciful), classify the mark for the given product.
Expected behavior
Returns the correct category, e.g. 'arbitrary' for a real English word with no relation to the product. The answer matches Stanford's annotation, so the check passes.
Check 02 · Deterministic

Misclassifies the trademark

Input
Given the Abercrombie rule (generic / descriptive / suggestive / arbitrary / fanciful), classify the mark for the given product.
Expected behavior
Returns the wrong category (e.g. 'generic' when the answer is 'descriptive'). The answer does not contain the annotated label, so the substring match against Stanford's annotation fails.
Grading

Judging criteria.

What a pass means

A pass means the AI's answer contains the correct categorical label (e.g. "arbitrary", "generic", "Yes", "No") as a case-insensitive substring. This mirrors LegalBench's own grader.
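
The pass criterion above can be sketched in a few lines of Python. This is a minimal illustration of case-insensitive substring matching as described here, not the pack's or LegalBench's actual grader code; the function name `passes` and the sample answers are assumptions.

```python
def passes(answer: str, expected_label: str) -> bool:
    """Return True if expected_label occurs anywhere in answer, ignoring case.

    Hypothetical helper: it illustrates the stated criterion only.
    """
    return expected_label.casefold() in answer.casefold()


# Sample answers are made up for illustration.
print(passes("This mark is Arbitrary for the product.", "arbitrary"))  # True
print(passes("generic", "descriptive"))  # False
```

Because this is a plain substring test, a short label such as "No" would also match inside a longer word like "cannot". The sketch does nothing to guard against that.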

Data sources

  • Stanford LegalBench

    162 expert-labeled legal-reasoning tasks built by Stanford CRFM and law-faculty collaborators (Apache-2.0). This pack uses two: trademark classification (Abercrombie spectrum) and hearsay (Federal Rules of Evidence).