Law Firm Document Discovery: 10,000 docs reviewed by morning, on a server you own

The AI legal workforce for firms that can't put privileged matter on a multi-tenant cloud. Chain-of-custody hash-stamped audit, memos drafted in firm voice, per-client agent isolation, per-jurisdiction servers. Replaces Relativity + Harvey AI at a fraction of the cost, on infrastructure you control.

osModa is the open-source platform that makes this real. 10 daemons, 92 typed tools, NixOS atomic rollback, KEYD secrets vault, SHA-256 hash-chained audit ledger. Apache-2.0 licensed; the firm can self-host on its own hardware or use the managed cloud from $99/mo. Below: how the workflow runs, why it satisfies ABA supervision rules, and what a 25-attorney firm actually spends per year.


TL;DR

• 10,000-document privilege review runs overnight on one dedicated server. Bates numbering, classification, redaction, production export — all from the same agentic stack.
• Chain-of-custody is a SHA-256 hash-chained ledger; every action is auditable and the ledger is stored on the firm's server, not ours.
• Per-client agent isolation extends the firm's conflicts wall to the AI layer (ABA Rule 1.10 alignment).
• Memos are drafted in firm voice after training on the firm's existing archive. The agent does not invent citations; it cites only sources retrieved through tool calls.
• Apache-2.0 license; self-host on your own hardware free, or use managed cloud from $99/mo. A 25-attorney firm typically saves $70k+/year vs Relativity + Harvey AI.

1. The pain — what discovery costs your firm today

Discovery is the largest single cost in modern litigation. A typical 10,000-document review at $1.50/document costs about $15k in attorney and paralegal time alone and runs 5–10 days. The legacy stack (Relativity at $150–350/user/month, Harvey AI at ~$1k/seat for partners, plus a paralegal team) delivers a 60–70% throughput improvement, but at $90–150k/year for a mid-sized firm. It also sends privileged matter to a multi-tenant cloud, which few client engagement letters actually permit.

Relativity

$150–350/user/month, multi-tenant SaaS, weeks of integration work for each new matter, no per-jurisdiction data-residency option for most firms.

Harvey AI

~$1k/seat, partner-only access, closed-source, sends privileged docs to a multi-tenant cloud, no audit-grade chain-of-custody.

Paralegal team

2,000 docs/day per paralegal at $80/hour. A 10k-document review books 5–10 days of dedicated capacity. Quality varies between reviewers.

Hidden costs

Data residency violations on EU client work. ABA Rule 5.3 supervision exposure. Conflicts walls that don't reach the vendor cloud. Discovery defensibility challenges from opposing counsel.

2. How it runs on osModa

The same review workflow Relativity provides, but run nightly on your server, with a chain-of-custody ledger and memos in the firm's voice. Five stages, all running while the partners sleep:

1. Ingest. 10,000 documents pulled from iManage / NetDocuments / SharePoint via API into the matter's isolated workspace on the firm's server. Each document gets a SHA-256 hash on entry, timestamped to the ledger.
2. Classify. Privilege classifier runs (trained on the firm's historical privilege calls). Each document tagged: privileged / responsive / confidential / public. Confidence score attached. Anything below threshold queued for human review.
3. Bates-number + redact. Production sequence assigned. Redaction agent applies the firm's approved redaction patterns to PII, privileged content, and pre-flagged sensitive language. Each redaction is hash-stamped.
4. Memo draft. Memo agent reads the responsive set and drafts a privilege log + a deposition-prep memo + a key-doc summary, all in the firm's voice (trained on the firm's memo archive).
5. Stamp + export. Final ledger entry includes the document set hash, the privilege log, and the supervising attorney's approval signature. Production export to opposing counsel only fires after the supervising attorney's one-tap approval.
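
As a rough illustration of stages 1 to 3, the sketch below hashes each document on entry, routes low-confidence classifications to a human-review queue, and assigns a production sequence to the accepted responsive documents. It is not osModa's implementation: the `Document` fields, the 0.85 threshold, the `ABC` prefix, and document-level (rather than page-level) Bates numbering are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass

# Assumed cut-off; the text above does not state the real threshold.
REVIEW_THRESHOLD = 0.85


@dataclass
class Document:
    doc_id: str
    content: bytes
    label: str         # "privileged" | "responsive" | "confidential" | "public"
    confidence: float  # classifier confidence in [0, 1]


def content_hash(doc: Document) -> str:
    """SHA-256 of the raw document bytes, recorded at ingest."""
    return hashlib.sha256(doc.content).hexdigest()


def route(docs, threshold=REVIEW_THRESHOLD):
    """Split documents into auto-accepted and human-review queues."""
    accepted, needs_review = [], []
    for doc in docs:
        if doc.confidence >= threshold:
            accepted.append(doc)
        else:
            needs_review.append(doc)
    return accepted, needs_review


def assign_bates(docs, prefix="ABC", start=1, width=7):
    """Give each responsive document a sequential, zero-padded Bates number."""
    responsive = [d for d in docs if d.label == "responsive"]
    return {
        d.doc_id: f"{prefix}{n:0{width}d}"
        for n, d in enumerate(responsive, start=start)
    }
```

Only documents that clear the threshold reach `assign_bates`; everything else waits in `needs_review` for an attorney or paralegal, which mirrors the "anything below threshold queued for human review" rule in stage 2.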

3. ABA Model Rules + jurisdiction alignment

Rule 1.6 — Confidentiality

Per-client agent isolation at the OS level. The agent processing Client A cannot read Client B's data; the runtime enforces this, not application code. Apache-2.0 source means your IT and ethics counsel can audit the actual implementation, not a vendor's marketing claims.

Rule 1.10 — Imputation of conflicts

The conflicts wall extends to the agentic layer. When a new matter conflicts with an existing one, the agent for the old matter is locked out of any data the new matter touches.

Rule 5.3 — Vendor supervision

The audit ledger gives the supervising lawyer a complete, hash-stamped record of every action the agent took. The policy gate (Paper agent) blocks any action the supervising lawyer hasn't pre-authorized.

Jurisdiction + data residency

Per-jurisdiction servers: a NY firm's EU clients stay on Hetzner Falkenstein or Helsinki, US clients on AX-series in Phoenix, GCC clients on private bare metal in Dubai. No data crosses a multi-tenant cloud boundary.

Discovery defensibility (FRE 901, 902)

Hash-chained ledger entries can qualify as self-authenticating under FRE 902(13)/(14) for digital records when accompanied by the certification of a qualified person that those rules require. Federal courts increasingly expect AI-in-the-loop reviews to produce this kind of audit trail.

4. Cost — a worked 25-attorney firm example

Line item	Legacy stack	osModa
Document review platform	Relativity, $250/user/mo × 25 = $75k/yr	osModa Pro, $99/mo × 6 servers = $7,128/yr
AI memo & analysis	Harvey AI, $1k/seat/mo × 8 partners = $96k/yr	Included in osModa stack, $0/yr
LLM API calls (review volume)	Bundled in Harvey	Anthropic / OpenAI direct, ~$2.5k/mo = $30k/yr
Chain-of-custody / audit	Add-on, ~$15k/yr setup	Native (hash-chained ledger), $0/yr
Per-jurisdiction data residency	Multi-tenant SaaS — partial coverage	Per-server, fully isolated, $0 incremental
Self-host option	Not available	Free (Apache-2.0) — server costs only
TOTAL	$186k/yr	$37k/yr (managed) or ~$30k/yr (self-host)

Annual savings: ~$149k for a 25-attorney firm. Larger firms see proportionally larger savings; the per-server economics improve with scale.
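
The totals in the table can be checked with a few lines of arithmetic. This is only a restatement of the table's own figures; the one assumption is that Harvey AI's $1k/seat is a monthly price, which is what the $96k/yr line implies.

```python
MONTHS = 12

# Legacy stack, annual cost in USD.
legacy = {
    "relativity": 250 * 25 * MONTHS,   # $250/user/mo x 25 users
    "harvey_ai": 1_000 * 8 * MONTHS,   # assumed $1k/seat/mo x 8 partners
    "audit_addon": 15_000,             # chain-of-custody add-on
}

# osModa managed plan, annual cost in USD.
osmoda_managed = {
    "servers": 99 * 6 * MONTHS,        # $99/mo x 6 servers
    "llm_api": 2_500 * MONTHS,         # ~$2.5k/mo in model API calls
}

legacy_total = sum(legacy.values())
managed_total = sum(osmoda_managed.values())
savings = legacy_total - managed_total

print(f"legacy:  ${legacy_total:,}")
print(f"managed: ${managed_total:,}")
print(f"savings: ${savings:,}")
```

The script prints $186,000 for the legacy stack, $37,128 for managed osModa, and $148,872 in savings, which round to the $186k, $37k, and ~$149k shown above.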

FAQ — law firm document discovery on osModa

How does osModa replace Relativity for document discovery?

osModa runs the same review workflow Relativity provides — Bates-numbering, privilege classification, redaction, production export — but on a dedicated server you own, with the agent code open-source and Apache-2.0 licensed. The crucial difference is sovereignty: privileged client matter never crosses the multi-tenant cloud boundary. A New York firm representing EU clients can run osModa on Hetzner Falkenstein and stay inside EU data residency rules without spinning up a separate Relativity tenant. Cost lands at $99–299/mo plus model API calls, vs $150–350/user/month for Relativity.

Is osModa a Harvey AI alternative?

It overlaps and goes further. Harvey AI focuses on memo drafting, deposition prep, and contract analysis through a hosted SaaS. osModa runs the same kind of agent on your server, but it also handles the systems work — chain-of-custody ledger, conflict checks across all open matters, EDR integrations, jurisdiction-specific data residency. Harvey is closed-source and enterprise-only (~$1k/seat). osModa is open-source under Apache-2.0; firms can self-host or use the managed plan from $99/mo.

What does 'chain-of-custody' actually mean here?

Every action the agent takes (every document opened, every classification made, every redaction applied) is appended to a SHA-256 hash-chained ledger. Each entry's hash is computed over the entry's content together with the previous entry's hash, so any after-the-fact tampering breaks the chain and is detectable. This supports the discovery defensibility standards courts increasingly expect when AI is in the review loop. The ledger is stored on the firm's own server; we cannot access it.
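
To make the tamper-evidence property concrete, here is a minimal hash-chain sketch. It illustrates the general technique only and is not the osModa ledger format: the `LedgerEntry` fields, the JSON serialization, and the all-zero genesis hash are assumptions chosen for the example.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


@dataclass(frozen=True)
class LedgerEntry:
    index: int
    action: str      # illustrative labels such as "open", "classify", "redact"
    payload: dict    # details of the action
    prev_hash: str   # hash of the preceding entry
    entry_hash: str  # SHA-256 over this entry's content plus prev_hash


def compute_hash(index: int, action: str, payload: dict, prev_hash: str) -> str:
    # Canonical JSON (sorted keys, fixed separators) so identical content
    # always serializes to identical bytes.
    body = json.dumps(
        {"index": index, "action": action, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(ledger: list, action: str, payload: dict) -> LedgerEntry:
    """Append a new entry whose hash covers the previous entry's hash."""
    prev_hash = ledger[-1].entry_hash if ledger else GENESIS_HASH
    index = len(ledger)
    entry = LedgerEntry(
        index, action, payload, prev_hash,
        compute_hash(index, action, payload, prev_hash),
    )
    ledger.append(entry)
    return entry


def verify_chain(ledger: list) -> bool:
    """Recompute every hash and check that each entry links to the one before it."""
    prev_hash = GENESIS_HASH
    for i, entry in enumerate(ledger):
        if entry.index != i or entry.prev_hash != prev_hash:
            return False
        expected = compute_hash(entry.index, entry.action, entry.payload, entry.prev_hash)
        if entry.entry_hash != expected:
            return False
        prev_hash = entry.entry_hash
    return True
```

Editing any earlier entry changes its recomputed hash, so `verify_chain` returns `False` unless every later entry is rewritten as well. Detecting a full rewrite would additionally require anchoring the latest hash somewhere outside the ledger, which this sketch does not cover.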

Can the AI draft memos in our firm's voice?

Yes. The agent reads the firm's existing memo archive (typically 200+ memos is enough) and learns the preferred constructions, citation format, tone, and the ways your senior partners structure analysis vs conclusion. New memos are drafted in that same voice. The agent does not invent citations: it cites only cases the human supervisor has supplied or that it retrieves from your legal research service (Westlaw / Lexis / Bloomberg Law via API). Hallucinated citations are a known LLM failure mode; restricting citations to tool-call results is designed to prevent them, and the supervising attorney still reviews every memo.

Per-client agent isolation — what does that mean for our conflicts wall?

Each client gets its own agent process with its own memory and its own KEYD vault scope. Agents cannot read each other's data; the OS-level isolation is enforced by the runtime, not by application code. This is significant for ABA Model Rule 1.6 (confidentiality) and Rule 1.10 (imputation of conflicts). Your conflicts wall extends to the agentic layer.

What happens if the agent makes a mistake?

Three layers of safety. First, every irreversible action — sending a privileged document to opposing counsel, deleting a file from the production set — pauses for human approval; the agent never makes those calls itself. Second, NixOS atomic rollback means any system-level mistake (a bad classifier deployment, for instance) reverts in one command. Third, the hash-chained ledger means a mistake is auditable and explainable in court — you can show a judge exactly what happened, when, and why.
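
The first layer, pausing irreversible actions for human approval, can be sketched as a small gate in front of the agent's actions. This is an illustration under stated assumptions rather than osModa's policy gate: the action names, the `ApprovalGate` class, and the single-use approval rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical names for actions that cannot be undone once executed.
IRREVERSIBLE = {"export_production", "delete_from_production"}


class ApprovalRequired(Exception):
    """Raised when an irreversible action has no attorney approval on file."""


@dataclass
class ApprovalGate:
    # Maps (action, target) to the approving attorney.
    approvals: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def approve(self, action: str, target: str, attorney: str) -> None:
        """Record a supervising attorney's approval for one action on one target."""
        self.approvals[(action, target)] = attorney

    def run(self, action: str, target: str, fn: Callable[[], object]) -> object:
        """Run fn, but only after approval if the action is irreversible."""
        if action in IRREVERSIBLE:
            # pop() consumes the approval, so each approval covers one execution.
            if self.approvals.pop((action, target), None) is None:
                raise ApprovalRequired(f"{action} on {target} needs attorney approval")
        return fn()
```

Reversible actions such as classification pass straight through, while an export raises `ApprovalRequired` until `approve` has been called for that exact action and target. Making approvals single-use is a design choice for the sketch: a second export of the same set needs a second sign-off.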

How does this fit ABA Model Rule 5.3 supervision requirements?

Rule 5.3 requires supervising lawyers to make reasonable efforts to ensure non-lawyer assistants (including AI vendors) follow the rules of professional conduct. osModa supports this in three ways: (1) the audit ledger gives the supervising lawyer a complete record of what the agent did, (2) the policy gate (Paper agent) blocks the agent from any action the supervising lawyer hasn't pre-authorized, and (3) the open-source codebase means a firm's IT and ethics counsel can audit the agent's actual behavior, not a vendor's marketing claims.

How fast can a firm get this running?

Roughly two weeks for a firm of 10–50 attorneys. Day 1: provision a dedicated server in your jurisdiction (Hetzner FSN1 / FSN2 for EU, AX-series in US). Days 2–4: connect Active Directory, your DMS (iManage / NetDocuments), and your case-management system. Days 5–7: train the memo agent on existing firm output. Days 8–14: pilot on one practice group, audit the chain-of-custody ledger, expand. Most firms then run a 30-day proof-of-concept on one matter before fleet rollout.

What's the typical cost vs Relativity / Harvey AI?

A 25-attorney firm running Relativity at $250/user/month plus Harvey AI at $1k/seat/month for 8 partners spends about $171k/year in license fees ($75k + $96k), before any chain-of-custody add-on. The same workload on osModa managed Pro ($99/mo per server, run 4–6 servers for redundancy and per-jurisdiction isolation) costs about $4.8–7.1k/year in servers plus model API calls (~$2–4k/month at typical review volume, or $24–48k/year), for a total of roughly $29–55k/year. Self-hosted on the firm's own hardware is essentially the model API cost alone.

Run discovery on a server your firm owns.

$99/mo managed Pro plan. Apache-2.0 licensed self-host. Hash-chained audit ledger included on every plan. Per-jurisdiction servers without multi-tenant compromise.


Last updated: May 2026
