The 7 Most Nightmare-Inducing Data Types in Document Review, Ranked

When it comes to document review, not all data types (or review platforms) are created equal. Anyone should be able to review a letter on an 8.5x11 page with no problems, and even emails shouldn’t be too bad for anyone who survived the 20th century (as long as you have email threading, but I digress). 

However, some data types turn into the stuff of nightmares in document review. Worse, the sheer amount of data AND the number of strange data types are growing every day. Whether it’s the impossibility of translating them to a document format, the massive file size, or the fact that they may no longer exist, we’ve ranked the seven scariest data types in doc review.

7. The 500-page PDF 

Sure, this is already in a document-adjacent format, but does there have to be quite so much of it? Then there’s the nightmare scenario of a 500-page document that is actually 50 individual documents Frankensteined together. Good luck finding page seven of document eight. Finally, any lag time in your review platform will get you all too well acquainted with the spinning wheel of death during document review (not that we know anything about that at DISCO, thanks to subsecond document viewing speeds).

6. Texts/SMS

Any data type that requires you to convince clients to give up their phone is already not going to be a walk in the park. (Have a client who’s unable to part with their phone? Luckily DISCO has a remote collection kit for mobile devices.)

Then, when reviewing the actual texts, it can be difficult to match up who is texting whom. With group texts in particular, you’d better hope your platform does a good job of threading so you have some way to associate conversations. 

See a text that’s supposed to have an attachment, but it’s missing? 😱😱😱

In most platforms, that means you’ll have to click into documents to find out what the associated named image is, which can be painful if you can’t easily jump from the parent to the attachment. 

Fortunately, DISCO makes it easy to review texts with clear family associations, speedy load times, beautiful renderings, and the ability to easily move back and forth between parent and attachment. DISCO will also render emojis accurately, although correctly interpreting the winky face is still your responsibility.

5. Excel files

Excel may be the format by which some of us live and breathe (it’s totally normal to use Excel to track household chores, right?). But the very fact that Excel spreadsheets are so multifaceted can make them a pain in discovery. From sprawling sheets to hidden cells to mystery formulas, there’s a lot of room for error. Plus, redacting Excel files is its own form of purgatory, unless you’re using a platform that has native rendering, like DISCO Ediscovery, of course.

4. Zoom data 

The tricky part about Zoom data is that not only is there video and chat, but you’ll also need transcription of the video — all for one “document” (if there is no transcript, well, prepare yourself for hours of review time and cross your fingers it won’t crash). Of course, you’ll also have to check for side conversations in Slack or Teams happening at the same time as the video, which are likely just as relevant. 

3. Ephemeral messaging 

If you find out your client has been using ephemeral messaging, make sure there isn’t a spoliation concern. Did the business start using ephemeral messaging long before litigation was reasonably anticipated? Were there proper business justifications for using ephemeral messaging? Was there a retention policy in place? Can you collect any data? In the case of ephemeral messaging, the best defense is having proactive protocols in place for collection and preservation. If you think you might have to reach out to a provider, do so right away, because the messages obviously are not around for long!

2. CAD files

Since a CAD file is technically a 3D image, it exports in layers, which breaks most ediscovery platforms. To review CAD files on those platforms, you have to 1) pay for a license for CAD software, 2) download the native, and 3) open it up on your computer (which defeats the purpose of using a review platform, right?). Since the files are so difficult to read, law firms often just send them to experts for analysis. But since you’re relying on outside help, this could also mean you’re going into cases without all of the facts.

(For what it’s worth: DISCO renders CAD files as a PDF so you can see the relevant notes and search using keywords to find the CAD files you actually need to care about. If you want to review them in greater depth, you can have an expert review just the CAD files in question as opposed to every single CAD file in your database.)

1. Slack/Teams

Just look at the panel on the right and tell me how you are supposed to review that data: 

First, the obvious problem with short-form messaging apps: They’re exported as JSON, a format that is all but unintelligible in raw form. Finding a review platform that can recreate them in a readable format is essential. (Yes, the left-hand panel is what Slack looks like in DISCO. An improvement, no?)
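To make the "unintelligible in raw form" point concrete, here is a minimal Python sketch that turns a small JSON message list into readable lines. The sample data, the `user`/`ts`/`text` field names, and the `USERS` lookup are illustrative assumptions modeled loosely on a Slack-style export; real exports vary by tool and version, and this is not how DISCO or any specific platform does it.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample shaped like a Slack-style export (assumed field names).
RAW = """[
  {"user": "U02", "ts": "1700000060.000200", "text": "Sending now."},
  {"user": "U01", "ts": "1700000000.000100", "text": "Did you send the draft?"}
]"""

# Hypothetical lookup from user IDs to display names.
USERS = {"U01": "Alice", "U02": "Bob"}


def render_messages(raw_json, users):
    """Return one readable line per message, ordered by timestamp."""
    messages = json.loads(raw_json)
    lines = []
    for msg in sorted(messages, key=lambda m: float(m["ts"])):
        # "ts" is assumed to be epoch seconds stored as a string.
        when = datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc)
        user_id = msg.get("user", "unknown")
        name = users.get(user_id, user_id)  # fall back to the raw ID
        lines.append(f"[{when:%Y-%m-%d %H:%M} UTC] {name}: {msg.get('text', '')}")
    return lines


if __name__ == "__main__":
    for line in render_messages(RAW, USERS):
        print(line)
```

Even this toy version shows why rendering matters: the raw export stores opaque user IDs and epoch timestamps, and a reviewer needs names and human-readable times in order.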

But Slack and Teams have many more problems beyond file format. Namely, how do you organize it all? Is a document one day in a channel, or is it the entirety of the channel? And when each channel has thousands of messages, how do you break that up? Every matter will be different.
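One of the options above, treating each day in a channel as one "document," can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `channel`, `ts`, and `text` fields are hypothetical, and day boundaries are cut in UTC, which may not match the time zone that matters for your matter.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical messages; field names are assumptions for illustration.
MESSAGES = [
    {"channel": "general", "ts": "1700000000.0", "text": "Morning all"},
    {"channel": "general", "ts": "1700090000.0", "text": "Next-day follow-up"},
    {"channel": "legal", "ts": "1700000030.0", "text": "Hold notice sent"},
    {"channel": "general", "ts": "1700000060.0", "text": "Morning!"},
]


def group_by_channel_day(messages):
    """Group messages into (channel, UTC date) buckets, sorted by time."""
    docs = defaultdict(list)
    for msg in messages:
        when = datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc)
        docs[(msg["channel"], when.date().isoformat())].append(msg)
    for bucket in docs.values():
        bucket.sort(key=lambda m: float(m["ts"]))
    return dict(docs)


if __name__ == "__main__":
    for (channel, day), bucket in sorted(group_by_channel_day(MESSAGES).items()):
        print(f"{channel} {day}: {len(bucket)} message(s)")
```

Swapping the grouping key (for example, dropping the date to get one document per channel) is a one-line change, which is part of why the "right" unit ends up being a per-matter decision.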

The upside to these short-form communication channels is that they often contain more casual conversation — and therefore, more potential evidence — than mediums like email. Although getting the full context will likely require some work — figuring out what Zoom calls were happening at the same time, or navigating the in-jokes of custom emoji — all of that hassle will likely be worth it, eventually. 

Though document review will likely never be a dream for legal practitioners, the right review platform (or managed review team) certainly means it will be less of a nightmare. Don’t let your fears get the better of you — get in touch to see how DISCO can help make even scary data types a breeze.

Erin Russell

Erin Russell is the senior communications manager at DISCO. She has extensive experience covering tech and AI as a journalist and editor, and her bylines include Texas Monthly, Eater, and Austin Business Journal.