Speak the Language of Your Case Sooner Using Topic Clustering with Automatic Indexing

You’ve been tasked with collecting documents and investigating certain legal issues in a new litigation. There are hundreds of thousands of potentially relevant documents, emails, and messages distributed across more than a dozen custodians. As usual, you interview a few witnesses and analyze some key documents in order to draft search terms. These search terms, you hope, will find all the relevant documents in the rest of the document population in time for you to advise your client about the company’s potential exposure.

But halfway through your review, after you’ve committed to a timeline with your client, you learn that certain witnesses had a fun nickname for the company’s monthly compliance meeting. You also learn that some individuals used an unexpected abbreviation to describe a relevant category of business risk. You heave a sigh and draft supplemental search terms to capture the new documents, hoping you won’t have to push back the date you promised results.

Sometimes, search terms can fall woefully short of capturing all key documents, especially if the organization in question used specialized terminology. As any lawyer knows, your understanding of the language of the case evolves as you learn more about the facts. This can pose a challenge when strategic search decisions have to be made at the beginning of the case.

Pick up the jargon of your case faster with topic clustering 

Topic clustering can help you learn the language of the case faster. It’s like using a treatise to start your legal research instead of going case by case. A smart junior associate will always start a novel research question with a treatise, because a treatise can give you an overview of what language is generally used to describe the legal issue, and it can tell you what auxiliary legal issues are related to the topic. 

Topic clustering groups the documents in your case into clusters and describes each cluster using helpful contextual phrases pulled from the documents themselves. That means you can see some of the common phrases in your case right away, before you’ve reviewed a single document. Here’s how that looks on the Enron data.
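For readers who want a feel for the general idea outside the product, here is a toy sketch in Python. It is not DISCO's algorithm, which this post does not describe. It assumes a simple greedy grouping by shared vocabulary, and the sample documents, the stopword list, the threshold, and every function name are hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "for", "from", "on", "in"}

def tokens(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def jaccard(a, b):
    """Share of words that two word sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold=0.2):
    """Greedy single-pass clustering: a document joins the first cluster whose
    accumulated vocabulary overlaps it enough, otherwise it starts a new one."""
    clusters = []  # each entry: {"vocab": set of words, "docs": list of texts}
    for doc in docs:
        words = set(tokens(doc))
        for c in clusters:
            if jaccard(words, c["vocab"]) >= threshold:
                c["docs"].append(doc)
                c["vocab"] |= words
                break
        else:
            clusters.append({"vocab": set(words), "docs": [doc]})
    return clusters

def label(c, n=3):
    """Describe a cluster by its n most frequent two-word phrases."""
    pairs = Counter()
    for doc in c["docs"]:
        t = tokens(doc)
        pairs.update(zip(t, t[1:]))
    return [f"{first} {second}" for (first, second), _ in pairs.most_common(n)]

# Made-up sample documents, not taken from the Enron data.
docs = [
    "Agenda for the emerging issues task force meeting",
    "Notes from the emerging issues task force",
    "Pipeline capacity report for the gas desk",
    "Updated pipeline capacity numbers for gas trading",
]

clusters = cluster(docs)
for c in clusters:
    print(len(c["docs"]), "documents:", ", ".join(label(c)))
```

Even in this tiny example, the phrases attached to each group hint at vocabulary you might not have thought to search for, which is the benefit the topic list gives you at the scale of a real case.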

You can also filter your topic list by a word or phrase by clicking the magnifying glass at the top right of the topic list. For example, when researching a compliance issue, I might filter my topic list by “compliance,” “committee,” or “task force” to see which compliance meetings appear in my data that I may want to follow up on.

In the Enron data, I can see from my topic list that Enron had an “Emerging Issues Task Force” that looks like it’s worth investigating.
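Conceptually, filtering the topic list is a case-insensitive substring match over the topic labels. The snippet below is a minimal sketch of that idea, not the product's implementation. The function name `filter_topics` is hypothetical, and every topic label other than “Emerging Issues Task Force” is invented for the example.

```python
def filter_topics(topics, query):
    """Return the topic labels that contain the query, ignoring case."""
    needle = query.casefold()
    return [t for t in topics if needle in t.casefold()]

# Only the first label comes from the post; the rest are made up.
topics = [
    "Emerging Issues Task Force",
    "Pipeline Capacity",
    "Compliance Committee Agenda",
    "Holiday Party",
]

for query in ("compliance", "committee", "task force"):
    print(query, "->", filter_topics(topics, query))
```

Running a handful of filters like this is a quick way to check which committees, meetings, or working groups show up in the data before you commit to search terms.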

I wish topic clustering had been around while I was writing search terms for litigation cases. It’s another way to identify key terminology and get a handle on your case faster. 

Click here to try DISCO Ediscovery with topic clustering. If you're already a DISCO user, any new databases will have this feature enabled automatically. Get in touch with your DISCO representative or DISCO Desk if you have any questions.

Anush Emelianova

Anush Emelianova is a product marketing manager at DISCO. Before joining DISCO to further the cause of legal technology and AI adoption, she spent 10 years practicing litigation and counseling clients on data breach response and data privacy compliance.