Effective Early Case Assessment Using Topic Clustering with Automatic Indexing

Product Spotlight


By: Kristin Zmrhal

Posted: July 20, 2022


⚡️ 1-Minute DISCO Download

The volume of data, and the number of sources it must be collected from, keeps growing rapidly. Now more than ever, it's critical to use a robust early case assessment (ECA) platform and process to evaluate what’s in your collection before bringing in the review team to tag the documents. While you may know key facts, custodians, and date ranges during ECA, there are always “unknown unknowns.” You often have to conduct a first-pass review, which can take weeks or months, before you truly understand the contours of your data.

Throughout the review process, the legal team inevitably comes across new evidence (a flurry of emails, new witnesses, or a pivotal event), and you’re faced with the difficult decision of whether to interview or collect from new custodians and add more data to your collection, significantly increasing the cost of the case. With legacy technology, you couldn’t change course quickly when the scope of the matter evolved because the data was split among different databases. Even worse, when you learned of a new fact, term, or potential witness, you had to rely on your memory to carry it back into your ECA database, which could be a different technology altogether from your review application.

Manage ECA the modern way with topic clustering

Topic clustering lets you quickly identify common themes by showing you clusters of related documents, organized by topic. The automatically generated topic index lets you select the topics that are likely relevant to your case, then promote those groups of documents to active review incrementally, ultimately saving on review costs. For example, a topic cluster that includes contextual phrases like “reasonable time” and “undue delay” is probably relevant to your breach of contract case where the opposing party took too long to fulfill its obligations. Once you’re ready to start your first-pass review, topic clustering can also help you prioritize documents relevant to your client’s or company’s main concerns.
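To make the incremental promotion idea concrete, here is a minimal Python sketch. It is an illustration only, not DISCO's API: the `TopicCluster` and `Workspace` classes, the `promote` method, and the sample document IDs are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TopicCluster:
    """Hypothetical stand-in for one topic in the index."""
    phrases: list   # representative phrases shown for the topic
    doc_ids: set    # documents grouped under the topic


@dataclass
class Workspace:
    """Hypothetical ECA workspace that tracks what is in active review."""
    review_set: set = field(default_factory=set)

    def promote(self, cluster):
        """Add a cluster's documents to active review; return how many were new."""
        new_docs = cluster.doc_ids - self.review_set
        self.review_set |= new_docs
        return len(new_docs)


# Made-up clusters for a breach of contract matter.
clusters = [
    TopicCluster(["reasonable time", "undue delay"], {"doc1", "doc2", "doc5"}),
    TopicCluster(["lunch order", "offsite"], {"doc3", "doc4"}),
    TopicCluster(["delivery schedule", "undue delay"], {"doc5", "doc6"}),
]

workspace = Workspace()
first_pass = workspace.promote(clusters[0])  # promote the clearly relevant topic first
later = workspace.promote(clusters[2])       # add a second topic as the case develops
print(first_pass, later, sorted(workspace.review_set))
```

Because documents already in review are skipped, promoting a second, overlapping topic later only adds the documents the review team has not yet seen.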

Using topic clusters to identify relevant documents works well because it doesn’t depend on getting the search terms exactly right, or even knowing them at all. DISCO’s unsupervised learning system automatically finds related documents in the background, so you don’t have to know every code word or acronym at the outset. You can rely on topic clustering to tell you that there is a group of documents related to the topic of interest without knowing every term, and you can easily pull them all into active review out of your ECA workspace as you learn more.
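For readers curious about what "unsupervised" means here, the toy Python sketch below groups documents purely by the words they share, with no search terms supplied. It is not DISCO's algorithm, which this article does not describe. It swaps in a simple greedy, single-pass clustering over word-count vectors with cosine similarity, and the stop-word list, the `threshold` value, and the sample documents are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for the illustration.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "for", "on", "is", "was"}


def term_counts(text):
    """Lowercase the text and count its non-stop-word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_documents(docs, threshold=0.3):
    """Greedy single-pass clustering: a document joins the first cluster whose
    seed document is similar enough, otherwise it starts a new cluster."""
    clusters = []  # each entry: {"seed": Counter, "members": [doc ids]}
    for doc_id, text in docs.items():
        vec = term_counts(text)
        for cluster in clusters:
            if cosine(vec, cluster["seed"]) >= threshold:
                cluster["members"].append(doc_id)
                break
        else:
            clusters.append({"seed": vec, "members": [doc_id]})
    return [c["members"] for c in clusters]


# Made-up sample documents.
docs = {
    "doc1": "Delivery was not made within a reasonable time under the contract",
    "doc2": "The contract required delivery within a reasonable time, without undue delay",
    "doc3": "Lunch order for the team offsite on Friday",
    "doc4": "Friday offsite: please send your lunch order",
}

print(cluster_documents(docs))  # [['doc1', 'doc2'], ['doc3', 'doc4']]
```

Nobody told the code to look for "reasonable time" or "undue delay"; the contract-related documents end up together simply because they use overlapping language. A production system would use far more sophisticated models, but the principle of grouping without predefined search terms is the same.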

I wish topic clustering had been around while I was building out my teams’ ECA workflows. I’d have used it to continuously assess the relevance of the review population, and incrementally add new topics to the review stage as we learned more in the case, rather than going back to a different database to search.

To see DISCO Ediscovery with topic clustering in action, request a demo. If you're already a DISCO user, any new databases will have this feature enabled automatically. Get in touch with your DISCO representative or DISCO Desk if you have any questions.

Kristin Zmrhal

Vice President, Strategy

Kristin Zmrhal has spent over twenty years working in the legal technology industry as a consultant, advisor, project/program manager, and technologist. At DISCO, Kristin drives product strategy and innovation, with a specific focus on modernizing and improving the in-house dispute resolution process through technology. Prior to her work at DISCO, Kristin built and led Google’s Ediscovery Project Management & Operations team in Silicon Valley. Before Google, she spent many years as a consultant for several Fortune 500 companies and AMLaw 200 firms.

