Effective Early Case Assessment Using Topic Clustering with Automatic Indexing

We all know that the volume of data and the number of sources to collect from keep growing rapidly. Now more than ever, it’s critical to use a robust early case assessment (ECA) platform and process to evaluate what’s in your collection before bringing in the review team to tag the documents. While you may know key facts, custodians, and date ranges during ECA, there are a number of “unknown unknowns.” You often have to conduct a first-pass review, which can take weeks or months, before you truly understand the contours of your data.

Throughout the review process, the legal team inevitably comes across new evidence (a flurry of emails, new witnesses, or a pivotal event), and you’re faced with the difficult decision of whether to interview or collect from new custodians and add more data to your collection, significantly increasing the cost of the case. With legacy technology, you couldn’t change course quickly when the scope of the matter evolved because the data was split among different databases. Even worse, when you learned of a new fact, term, or potential witness, you had to rely on your memory to carry it back into your ECA database, which could be a different technology altogether from your review application.

Manage ECA the modern way with topic clustering 

Topic clustering lets you quickly identify common themes by showing you clusters of related documents, organized by topic. The index allows you to quickly select topics that are likely relevant to your case, then promote those groups of documents to active review incrementally, ultimately saving on review costs. For example, a topic cluster that includes contextual phrases like “reasonable time” and “undue delay” is probably relevant to a breach of contract case in which the opposing party took too long to fulfill its obligations. Once you’re ready to start your first-pass review, topic clustering can also help you prioritize documents relevant to your client’s or company’s main concerns.

Using topic clusters to identify relevant documents works well because it doesn’t depend on getting the search terms exactly right, or even knowing them at all. DISCO’s unsupervised learning system automatically finds related documents in the background, so you don’t need to know every code word or acronym at the outset. You can rely on topic clustering to tell you that there is a group of documents related to the topic of interest without knowing every term, and you can easily pull them all into active review out of your ECA workspace as you learn more.
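
To make the idea concrete, here is a minimal Python sketch of unsupervised grouping by shared vocabulary. It is an illustration only and not DISCO’s actual algorithm, which this post does not describe. The greedy word-overlap approach, the function names (cluster_documents, documents_for_term), the stopword list, the 0.2 threshold, and the four sample documents are all assumptions made up for the example.

```python
import re

# Hypothetical sample collection; IDs and text are invented for illustration.
DOCS = {
    "d1": "Delivery was not made within a reasonable time under the contract",
    "d2": "The undue delay in delivery breached the contract",
    "d3": "Lunch menu for the office party on Friday",
    "d4": "Office party lunch order: pizza on Friday",
}

# Tiny assumed stopword list; a real system would use a far larger one.
STOPWORDS = {"the", "was", "not", "made", "within", "under", "for"}


def tokens(text):
    """Lowercase the text and keep distinct words longer than two letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def jaccard(a, b):
    """Share of words two sets have in common (0.0 when either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_documents(docs, threshold=0.2):
    """Greedy single-pass clustering with no labels or search terms.

    Each document joins the existing cluster whose vocabulary it overlaps
    most, provided the overlap reaches the threshold; otherwise it starts
    a new cluster.
    """
    clusters = []
    for doc_id, text in docs.items():
        doc_tokens = tokens(text)
        best, best_score = None, threshold
        for cluster in clusters:
            score = jaccard(doc_tokens, cluster["vocab"])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append({"vocab": set(doc_tokens), "doc_ids": [doc_id]})
        else:
            best["vocab"] |= doc_tokens
            best["doc_ids"].append(doc_id)
    return clusters


def documents_for_term(clusters, term):
    """Return every document in any cluster whose vocabulary has the term."""
    hits = []
    for cluster in clusters:
        if term in cluster["vocab"]:
            hits.extend(cluster["doc_ids"])
    return hits


clusters = cluster_documents(DOCS)
print([c["doc_ids"] for c in clusters])       # [['d1', 'd2'], ['d3', 'd4']]
print(documents_for_term(clusters, "delay"))  # ['d1', 'd2']
```

In this toy data, looking up “delay” also surfaces d1, which talks about a “reasonable time” and never uses the word “delay.” That is the practical benefit described above: the grouping does the work of connecting related documents, so one known term can lead you to documents that phrase the same issue differently.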

I wish topic clustering had been around while I was building out my teams’ ECA workflows. I’d have used it to continuously assess the relevance of the review population, and incrementally add new topics to the review stage as we learned more in the case, rather than going back to a different database to search.

Click here to get a demo of DISCO Ediscovery with topic clustering. If you're already a DISCO user, any new databases will have this feature enabled automatically. Get in touch with your DISCO representative or DISCO Desk if you have any questions.

Kristin Zmrhal