Teredacta
5,600 government redactions recovered from the Epstein documents.
Live (beta)
When Congress released the Epstein/Maxwell documents, they came redacted. Blacked-out names, hidden paragraphs, entire pages obscured. Standard government transparency theater. But here's the thing about bureaucracies: they're not consistent. The same document gets released by different agencies, at different times, with different redactions. What the FBI blanks out, the DOJ might leave in. What's hidden in one batch appears in another.
Teredacta exploits that inconsistency. Feed it every version of every document, and it finds the overlaps, the gaps, the places where one agency's release shows what another agency's redaction hid. So far: 1.4 million documents ingested. 15,000+ document groups identified. 5,600+ redacted passages recovered.
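A toy illustration of the idea, not the project's actual code: assume redactions show up in extracted text as a literal `[REDACTED]` token (a hypothetical marker), align two versions of the same passage word by word, and fill each version's gaps from the other. The function name `merge_versions` is made up for this sketch.

```python
from difflib import SequenceMatcher

REDACTED = "[REDACTED]"  # hypothetical marker; real releases use black boxes


def is_redacted(tokens):
    """True if the segment is non-empty and consists only of redaction markers."""
    return bool(tokens) and all(t == REDACTED for t in tokens)


def merge_versions(text_a, text_b):
    """Align two versions of a passage and fill redactions from the other one.

    Version A is the base text. Returns (merged_text, recovered_spans).
    """
    a, b = text_a.split(), text_b.split()
    merged, recovered = [], []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        seg_a, seg_b = a[i1:i2], b[j1:j2]
        if tag == "replace" and is_redacted(seg_a) and REDACTED not in seg_b:
            # A hides this span, B shows it: take B's words.
            merged.extend(seg_b)
            recovered.append(" ".join(seg_b))
        elif tag == "replace" and is_redacted(seg_b) and REDACTED not in seg_a:
            # B hides this span, A shows it: keep A's words.
            merged.extend(seg_a)
            recovered.append(" ".join(seg_a))
        else:
            # Equal text, or a difference this sketch cannot resolve: keep A.
            merged.extend(seg_a)
    return " ".join(merged), recovered


# Two releases of the same (invented) sentence with different redactions.
version_a = "Agent met [REDACTED] at the office on 12 March"
version_b = "Agent met John Doe at the office on [REDACTED]"
merged_text, recovered_spans = merge_versions(version_a, version_b)
```

Here each version supplies what the other hides. Real documents add OCR noise, layout differences, and redactions of unknown length, which is presumably why a single alignment pass like this would not be enough on its own.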
What's Been Found
This isn't theoretical. The recovered text includes FBI evidence log entries, Ghislaine Maxwell's drafts of a Vanity Fair PR response, an MCC psychologist's notes from a debate about Jeffrey Epstein just ten days before his death, staff interview lists compiled two days after, and references to missing teenagers in flight logs that someone decided the public shouldn't see.
How It Works
The system is two pieces. Unobfuscator is the pipeline. It ingests documents, fingerprints them using MinHash and locality-sensitive hashing to find duplicates released under different redaction patterns, then runs multi-strategy merge algorithms to recover what was hidden. Think of it as diff-for-redactions at massive scale.
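The fingerprinting step can be sketched roughly as follows. This is a minimal standard-library illustration of MinHash with banded locality-sensitive hashing, not Unobfuscator's implementation: the function names, the 3-word shingles, the 128 hash functions, and the 64-band split are all assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Set of overlapping k-word shingles from lowercased text."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + k]) for i in range(max(1, len(tokens) - k + 1))}


def minhash(shingle_set, num_perm=128):
    """MinHash signature: the minimum of each salted 64-bit hash over the set."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in shingle_set
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def candidate_pairs(signatures, bands=64):
    """Banded LSH: documents sharing any identical band become candidates."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        for band in range(bands):
            key = (band, tuple(sig[band * rows:(band + 1) * rows]))
            buckets[key].add(doc_id)
    pairs = set()
    for doc_ids in buckets.values():
        pairs.update(combinations(sorted(doc_ids), 2))
    return pairs


# Invented example texts: two releases of one passage plus an unrelated one.
docs = {
    "release_1": "The agent interviewed [REDACTED] at the field office on 12 March "
                 "and filed the report with the records division the next morning",
    "release_2": "The agent interviewed John Doe at the field office on [REDACTED] "
                 "and filed the report with the records division the next morning",
    "unrelated": "Quarterly budget figures were approved by the finance committee "
                 "without any amendments",
}
signatures = {doc_id: minhash(shingles(text)) for doc_id, text in docs.items()}
pairs = candidate_pairs(signatures)
```

Many short bands (two rows each here) make the filter permissive, which suits this problem: redactions break shingles and lower the similarity between versions of the same document. Candidate pairs would then go on to a more careful comparison and merge.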
Teredacta is the web interface. An Entity Explorer maps recovered people, organizations, locations, and contact details into an interactive graph. A recovery viewer highlights exactly what was hidden and what was recovered. Full-text search across everything. Two repos, one mission. Both are open source.