Ediscovery for Social Media: The Complete Guide

Emerging Data Sources
By: Julio Ruelas
Posted: August 30, 2024

https://www.csdisco.com/blog/mastering-social-media-data-for-ediscovery

⚡️ 1-Minute DISCO Download

From Facebook posts to TikTok videos, social platforms contain critical legal evidence that’s dynamic, visual, and often fleeting. The challenge? Each platform behaves differently.

📊Key Stat

Social media platforms are used by more than 4.5 billion people worldwide – and many posts are gone in 24 hours or less.

🌊Dive Deeper

Jump to “Social Media Platforms and Ediscovery Best Practices” for platform-by-platform insights that can inform your collection strategy.

Over 5 billion people participate in social media globally—and new platforms crop up every year. The scale and speed of this adoption greatly increase the potentially relevant data for a case team to investigate when conducting ediscovery.

However, the right skills and tech stack can help make handling social media data a breeze, speeding up your time to important information and evidence. 

Keep reading for our step-by-step guide on how to do ediscovery with social media, including tips to master data collection on today’s top social platforms.


Understanding social media data

The vast amount of social media data falls squarely within the scope of electronically stored information (ESI) that is potentially discoverable under the Federal Rules of Civil Procedure.

The official Advisory Committee notes accompanying the amended 37(e) in the federal rules even go so far as to call out this data source explicitly: “It is important that counsel become familiar with their clients’ information systems and digital data — including social media — to address these issues.” 

Still, there is confusion as to which aspects of social media data are discoverable, and what the most defensible process is for each platform. 

Below, we’ll cover the five largest platforms: Facebook, X (formerly Twitter), YouTube, TikTok, and LinkedIn.

Ediscovery for social media: Key challenges

Social media ediscovery is now a routine part of modern litigation, but it brings unique challenges. Posts vanish, formats evolve, and meaning depends heavily on context. 

Unlike traditional documents, social content is dynamic, visual, and designed for mass engagement. Here are some of the technical, legal, and procedural hurdles to be aware of when collecting and reviewing social media in ediscovery.

Data collection

Social media content moves fast, with posts disappearing, updating, or expiring automatically. That makes timely, targeted collection critical. 

APIs and access policies vary by platform, and what’s public on one network might be private on another. Livestreams, stories, and comments may not be retained unless captured in real time. And while third-party tools help, they have their limitations. 

Without deep platform knowledge, it’s easy to miss key content or metadata in social media ediscovery.

Technical preservation

Social media content can’t simply be downloaded and stored. It has to be preserved with context, metadata, and interactive elements intact. Posts evolve, and across platforms like Facebook or Instagram, even small changes can complicate preservation.

To make things trickier, platform updates can break ediscovery workflows, disrupting reviewability or undermining data integrity. Smart social media ediscovery requires flexible tools that can adapt as platforms change.

Privacy and access 

Beyond technical challenges, accessing social media data raises legal concerns. Private messages, restricted posts, and metadata may require consent, a subpoena, or both. And global privacy laws like GDPR raise the stakes.

Parsing these issues becomes even more difficult when personal and professional accounts overlap. One account might hold work-related communications but still be protected. A thoughtful approach helps you stay compliant and avoid over-collection.

Authentication and legal admissibility  

Courts expect evidence to be authentic, and screenshots rarely cut it. Without metadata or a clear chain of custody, even the most compelling content can be thrown out.

Proving authenticity may require more than the content itself, however. Timestamps, user details, and even forensic analysis may be needed to confirm authorship and integrity. When the source matters, so does how you collected it.
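
One common way to support integrity and chain of custody is to fingerprint content at the moment of capture and log who collected it, from where, and when. The sketch below is illustrative only: the `CustodyRecord` fields, the function names, and the example URL are assumptions made for this example, not a description of any particular collection tool or of DISCO's workflow.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyRecord:
    """One entry in a hypothetical chain-of-custody log."""
    source_url: str    # where the content was captured from
    collected_by: str  # person or tool that performed the capture
    collected_at: str  # UTC capture time in ISO 8601 format
    sha256: str        # fingerprint of the captured bytes


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the captured bytes."""
    return hashlib.sha256(content).hexdigest()


def record_capture(content: bytes, source_url: str, collected_by: str) -> CustodyRecord:
    """Build a custody record at the moment of collection."""
    return CustodyRecord(
        source_url=source_url,
        collected_by=collected_by,
        collected_at=datetime.now(timezone.utc).isoformat(),
        sha256=fingerprint(content),
    )


def is_unaltered(content: bytes, record: CustodyRecord) -> bool:
    """Check that the bytes on hand still match the recorded fingerprint."""
    return fingerprint(content) == record.sha256


# Hypothetical capture of a post body.
captured = b"example post body"
rec = record_capture(captured, "https://example.com/post/1", "examiner-01")
print(is_unaltered(captured, rec))               # the original bytes match
print(is_unaltered(captured + b" edited", rec))  # any change breaks the match
```

A matching hash only shows that the file has not changed since capture. Authorship and context still have to be established through metadata and testimony.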

Review and analysis  

Reviewing social media content isn’t straightforward. Posts may include slang, emojis, or visuals that shift meaning depending on context. Threads, replies, and visual cues all matter.

You may need to reconstruct conversations, translate messages, or analyze videos while being aware that group chats can raise privilege issues. AI can help, but human review is still critical to understanding nuance and flagging privilege issues.

Compliance and regulation  

Social media ediscovery must align with data retention laws, privacy regulations, and platform terms of service. But rules vary by jurisdiction, and improper collection can lead to sanctions or evidence exclusion.

Complicating things further, platforms may prohibit scraping or limit data access, and cross-border issues can trigger additional legal scrutiny. Defensible workflows require careful planning, legal oversight, and tools that can adapt to evolving compliance demands.

Social media platforms and ediscovery best practices

Every platform comes with its own quirks. What works for Facebook ediscovery won’t work for TikTok, and what’s visible on X today might be gone tomorrow. To build a smart collection strategy, legal teams need to understand how each platform operates. To help, we’ve highlighted key considerations and practical tips for handling social media ediscovery across today’s most-used apps.

Facebook

Founded in 2004, Facebook became the largest social network in the world by 2021 with nearly 3 billion users, half of whom were logging on daily. The platform serves as a comprehensive social network that allows users to post updates, photographs, videos, classified ads, and other content, while enabling them to "follow" one another and interact through public or private comments, messages, and "likes." 

Facebook also supports livestreams and permits the monetization of various content including videos, ads, and Facebook Marketplace commerce. With its largest audience consisting of users aged 25-34 and over 2 billion daily active users worldwide, Facebook operates under its parent company Meta, which also owns Instagram, Threads, and WhatsApp.

Facebook in court 

Facebook, whose parent company is now known as Meta Platforms, has been involved in a wide range of legal cases related to privacy, antitrust, content moderation, child endangerment, and use of images without permission.

Aside from Facebook’s (and Meta’s) own assorted legal embroilments, user content posted on Facebook has been used in court to show evidence of bullying and stalking, unfit parenting, hidden income in divorce proceedings, and wrongful termination, among others. 

However, not all Facebook-sourced content is admissible, and a number of challenges may have to be cleared, such as authentication of the content, or allegations of hearsay.

Ediscovery best practices for Facebook 

Several types of records from Facebook can be used as evidence in court, depending on the case and how the evidence is collected and presented. These may include:

  • Public posts. Think status updates, photos, videos (including livestreams), audio recordings, or comments (one’s own, or on other individuals’ content), which can be used to indicate the poster’s sentiments or intentions, actions, or even associations, such as membership in a specific club or group. If collected via screenshot, such evidence would need to be authenticated.
  • Private messages. These may be direct messages or chat logs, and generally require a subpoena or warrant to be accessed unless willingly provided by a participant in the conversation.  
  • Account information. This may include friend lists (useful to establish connections between individuals of interest), Facebook group memberships, and event attendance (or intent to attend – for example, accepting an invitation).
  • Metadata. Metadata may include time stamps on public or private posts, location data, advertising history, and even IP addresses. A court order is generally required for Facebook to disclose such information. 

It has been widely discussed that Facebook habitually collects vast volumes and types of data from its users, and may be compelled to share that data with government or law-enforcement agencies.

If you expect that Facebook-generated content will become relevant to your matters, it is important to work with partners who understand the details of Facebook ediscovery, from collecting to parsing and searching such data.

What about Instagram?

Like Facebook, Instagram is part of Meta Platforms and shares similar data practices and privacy policies. While content formats differ (posts are more visual, stories disappear quickly, and DMs are mobile-first), the core ediscovery challenges remain: collecting ephemeral content, preserving metadata, and authenticating user interactions.

Instagram ediscovery often requires a different technical approach, especially when capturing stories, reels, or ad content. If Instagram activity could be relevant in your matter, work with a partner who can navigate both the platform's features and Meta’s legal response process.

X (formerly Twitter)

Over the last several years, X has regained prominence as a major avenue of social, political, and interpersonal discussion, operating as a social "microblogging" platform that limits messages to 280 characters. With 368 million daily active users, X faces growing potential for legally actionable content, perhaps more than any other social media application. 

The platform's influence has sparked the creation of competing microblogging alternatives in recent years, including Threads (an Instagram offshoot with 130 million monthly active users), Bluesky, and politically influenced platforms such as Truth Social, reflecting the ongoing competition in the microblogging space.

X (Twitter) in court

X has a lengthy history in court, including stalking cases, libel and slander cases, and cases involving incitement during the London riots in 2011.

In one high-profile case related to WikiLeaks and the 2016 presidential election, X (at the time referred to as Twitter) sought to quash a Rule 45 subpoena based upon First Amendment rights to anonymous speech. The court ruled against X because the request was narrow (it excluded personal communication and demonstrated the material relevance of the user’s identity) and because only X itself could directly provide the information.

More recently, X lost an appeal of a ruling that allowed special counsel Jack Smith to access records from former President Donald Trump's X account as part of his federal election interference probe – and the Supreme Court heard a different case wherein X was found non-liable for content posted by a terrorist organization.

Ediscovery best practices for X (Twitter)

Be aware of private info that requires a subpoena or court order. While some material is publicly available, much will require either the cooperation of the account holder or, more challengingly, X itself.

Information not readily accessible to the public includes: 

  • Password 
  • Email address 
  • Phone number or address book (which helps X suggest users you know) 
  • Location information (where you’re posting from) 
  • System log data (mobile carrier, device and application IDs, IP address, browser, the referring domain, pages visited, and search terms) 
  • Specific posts (formerly “tweets”) set as private, direct messages and deleted posts

Per X’s FAQ on legal requests:

“Obtaining non-public information, such as an email address used to sign up for an account or IP login information, requires a valid legal process like a subpoena, court order, or other local legal process, depending on the country that issues the request.

Requests for the contents of communications (e.g., posts, Direct Messages, media) require a valid search warrant or equivalent to be properly served on the correct X corporate entity. Law enforcement or government agents must demonstrate a higher burden of proof before a judge will authorize such a request.

For additional information on the types of legal process required to obtain specific types of account information, please see the “Types of Legal Process” section in our transparency report and X’s Guidelines for Law Enforcement.” 

Act swiftly 

The most recent 3,200 posts are visible in a timeline, and X’s advanced search function can drill down even deeper based on timing, user, and subject matter. 

If a user deletes an incriminating post, the window of time to recover it is merely 30 days. 
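
Because the recovery window is fixed, the deadline is simple date arithmetic that is worth tracking for each deleted post. The sketch below takes the 30-day figure from the text above rather than from X's own documentation, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: the 30-day recovery window cited above. Confirm the current
# figure with the platform before relying on it.
RECOVERY_WINDOW_DAYS = 30


def recovery_deadline(deleted_on: date, window_days: int = RECOVERY_WINDOW_DAYS) -> date:
    """Return the last day on which a deleted post may still be recoverable."""
    return deleted_on + timedelta(days=window_days)


def days_remaining(deleted_on: date, today: date, window_days: int = RECOVERY_WINDOW_DAYS) -> int:
    """Return how many days are left in the window, never less than zero."""
    remaining = (recovery_deadline(deleted_on, window_days) - today).days
    return max(0, remaining)


deleted = date(2024, 8, 1)
print(recovery_deadline(deleted))                # 2024-08-31
print(days_remaining(deleted, date(2024, 8, 21)))  # 10
```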

Be specific 

Requests for data from X must be sufficiently narrow and specific for the social media behemoth to comply. If either of these parameters is not met, X is not afraid to fight back. 

In general, the best practice with regard to X-related requests should be to ensure your request is limited to material that is clearly relevant to the case, time-bound, and not readily accessible from any other data source.

It is also important to include the following data points in any request: 

  • Username 
  • URL of the X profile 
  • Date range(s) of the requested information 
  • Details about the specific information being requested and its relevance to the case 
  • Valid email address for X to acknowledge receipt of the legal request
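
The data points above can be treated as a pre-submission checklist. The sketch below is a hypothetical internal helper for a case team, not an X API or form: the `LegalRequestDraft` class, its field names, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class LegalRequestDraft:
    """Hypothetical draft holding the data points listed above."""
    username: str
    profile_url: str
    date_from: date
    date_to: date
    details_and_relevance: str
    contact_email: str


def find_problems(draft: LegalRequestDraft) -> list[str]:
    """Return a list of human-readable gaps in the draft (empty if none)."""
    problems = []
    for f in fields(draft):
        value = getattr(draft, f.name)
        if isinstance(value, str) and not value.strip():
            problems.append(f"{f.name} is empty")
    if draft.date_from > draft.date_to:
        problems.append("date range is reversed")
    # Very rough sanity check only; it does not validate the address.
    if draft.contact_email.strip() and "@" not in draft.contact_email:
        problems.append("contact_email does not look like an email address")
    return problems


draft = LegalRequestDraft(
    username="@example_user",
    profile_url="https://x.com/example_user",
    date_from=date(2024, 1, 1),
    date_to=date(2024, 3, 31),
    details_and_relevance="",
    contact_email="paralegal@example.com",
)
print(find_problems(draft))
```

A check like this only catches missing fields. Whether the request is narrow and relevant enough remains a legal judgment.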

YouTube

YouTube serves as the world's dominant video hosting platform, with over 500 hours of user- and enterprise-generated content uploaded every minute and 2.7 billion monthly active users, making it the second-most visited website globally behind its parent company Google. 

The platform generates over $31 billion in ad revenue annually by hosting videos and real-time livestreams that span everything from the mundane to the catastrophic, with traffic skyrocketing particularly during the COVID-19 pandemic. 

While YouTube faces competition from similar platforms such as Vimeo, Vevo, and DTube, its massive scale and integration with Google's ecosystem have solidified its position as the leading video-sharing service worldwide.

Unlike other streaming platforms, YouTube relies on users to create content, and only searches for policy violations after the fact.

This loose approach to content regulation has exposed the organization to criticism and misuse of the platform. It has also allowed millions of people to upload potentially relevant video content every day.

Previously existing barriers to including video evidence in a matter (namely, cost and the complexity of managing video data in review) have been greatly reduced, especially with the introduction of user-friendly features such as auto-transcription and searchable time-syncing in ediscovery platforms like DISCO.

This wealth of potentially relevant data is increasingly prominent as a result. 

YouTube in court

YouTube has faced myriad critiques ranging from copyright infringement and peddling conspiracy theories to darker things like violence and sexual exploitation of adults and minors.

Legions of content moderators are bombarded by questionable material every day and strive to pull down violators. Some former moderators have sued for emotional distress.

Ediscovery best practices for YouTube 

It is critically important to include user-generated content on platforms like YouTube, Vimeo, and others in your ESI scoping, but, as with any complex media, certain key steps are necessary. 

Whether you are looking to use a video as character evidence or as direct evidence of an alleged event, the content must meet the threshold of admissibility for relevance and authenticity. 

Even when the video has been admitted as evidence, there are some additional factors to consider in your digital evidence analysis: 

Take care to preserve metadata

Even if the video is still actively being hosted on the video-sharing platform, use appropriate forensic collection technology to ensure that all relevant account metadata is preserved along with the video itself. 

From an authentication standpoint, information like date and time of upload, account information, and even IP address may be germane to a case. 

Act swiftly

As with X, time is of the essence with YouTube requests. In the event the video in question was recently deleted, your counsel may be able to request a copy from YouTube directly, but these deleted files are unrecoverable after a period of a few weeks. 

Leverage AI for review

Historically, reviewing video evidence was time- and cost-prohibitive because of the high cost of converting the media to a reviewable format and the amount of billable time it would take to review tens or even thousands of hours of video.

Luckily, with today’s AI-powered tools like DISCO, audio and video content is transcribed and converted into a format that can be searched, categorized, and analyzed for words and phrases.

AI can make connections across thousands of hours of video that would have previously been impossible. 

Note: Transcriptions are only as good as the audio of the video. It’s still up to lawyers to validate that they’ve reviewed the relevant content. 
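
To show what searchable, time-synced transcription means in practice, here is a minimal sketch that searches transcript segments for a term and reports where in the video each hit occurs. The `Segment` structure and the sample text are assumptions made for this example. This is not DISCO's implementation, and a real transcript would come from a transcription tool.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One time-synced piece of a hypothetical transcript."""
    start_seconds: float  # offset from the start of the video
    text: str             # transcribed speech for this segment


def find_mentions(segments: list[Segment], term: str) -> list[Segment]:
    """Return segments whose text contains the term, ignoring case."""
    needle = term.casefold()
    return [s for s in segments if needle in s.text.casefold()]


def as_timestamp(seconds: float) -> str:
    """Format an offset in seconds as HH:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


transcript = [
    Segment(12.0, "Welcome back to the channel."),
    Segment(95.5, "We signed the Contract last Tuesday."),
    Segment(3725.0, "Nobody read the contract before signing."),
]

for hit in find_mentions(transcript, "contract"):
    print(as_timestamp(hit.start_seconds), hit.text)
```

A keyword hit is a pointer to a moment in the video, so a reviewer can jump to that timestamp and confirm what was actually said.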

Enlist digital forensic experts to identify deepfakes

Deepfakes are AI-generated content portraying real people doing and saying things that did not actually take place. The best deepfakes can be nearly indistinguishable from authentic video.

Thankfully, digital forensic experts can identify certain things that are a dead giveaway that a video has been tampered with, including: 

  • Lens distortion 
  • Color filter array (CFA) artifacts 
  • Noise level and pattern anomalies 
  • Compression artifacts 
  • Editing artifacts

Although deepfakes are a relatively recent phenomenon, we can likely expect to see more statutes and case law emerge in the near future. 

The rapid evolution of AI technologies creates daunting challenges for the authentication and use of evidence in court. Already, a suit has been filed over AI use of a deceased performer’s voice, and a number of bills have been proposed to prevent malicious or inappropriate use of deepfakes. Care should be taken to authenticate any key photo or video evidence, lest it turn out to be the product of an AI tool like DALL-E or Sora.

TikTok

TikTok arrived in the U.S. in 2017 and has gained immense popularity in recent years as a social media platform that allows users to create, edit, share, and discover short videos ranging from dance challenges to insider tips on home inspections.

With more than 1.5 billion users worldwide and a user base largely under 35, TikTok has become so influential that it has prompted competing platforms, including those that predate it, to adopt similar short-form video functionality, such as Instagram Reels (which has 500 million daily active users) and YouTube Shorts. Like Instagram and Facebook, TikTok also offers "live" feeds, further expanding its multimedia capabilities and cementing its position as a major force in the social media landscape.

TikTok in court

Note: At the time of this writing, the future of TikTok’s availability in the United States is uncertain due to ongoing legal disputes. TikTok remains at risk of a U.S. ban unless ByteDance divests its U.S. operations. 

The company is developing a new standalone app for U.S. users—internally called “M2”—along with a plan to spin off its U.S. business into a company owned by American investors such as Oracle, Blackstone, Andreessen Horowitz, and others; however, the deal requires approval from Chinese authorities to proceed. In the meantime, enforcement deadlines have been delayed as negotiations continue.

Regardless of the outcome, user-generated content will almost certainly start to show up in court. Much like YouTube, TikTok videos may be used as evidence of individuals’ actions and whereabouts. 

Additionally, TikTok is increasingly used to share and spread information from all around the globe. It has a popular TikTok “Live” function, wherein creators can live-stream a video feed, and receive virtual “gifts” that can be converted to real-world currency. And the economic complications don’t stop there. TikTok is generating billions of dollars in advertising revenue, popular users frequently post sponsored content, and nearly five million American businesses have a presence on the app, including some that make use of the TikTok Shop feature.

For legal practitioners, causes of action involving TikTok could range from social media marketing liability to large-scale copyright infringement concerns. As users and corporate entities alike monetize the platform, concerns about individual likeness, song sampling, and unfair advertising practices could all prompt regulatory scrutiny or large-scale litigation – especially considering the memetic nature of TikTok content, which rewards copycat behavior, and often utilizes shared filters, “trends,” and sounds (many of which come from other users’ content, or from copyrighted sources such as music or films).

Because TikTok content is built for remixing and viral replication, issues of copyright, likeness rights, and unauthorized reuse are increasingly common. Trends often borrow filters, sounds, or creative concepts, sometimes without credit or consent. As users and brands alike monetize the platform, what starts as a dance challenge could end up as a copyright claim, especially when content is repurposed for advertising or commercial gain.

Ediscovery best practices for TikTok 

TikTok, like many next-gen social media platforms and communication applications, is nothing like a traditional “document.” 

The public data collected from TikTok posts may include: 

  • The video or photograph(s) posted
  • Description text
  • Closed caption text (whether auto-generated or manually created)
  • Dynamic user interaction data, such as likes and comments
  • Filters and sounds used to create the post
  • Metadata about the user and the posting 

In addition, TikTok stores numerous data points that are not publicly visible. These can include: 

  • Profile and post views
  • Account and viewer activity analytics
  • Direct messages (DMs)
  • Saved live-streams (with accompanying data, such as chat logs and recorded “gifts” sent to the broadcaster)
  • Creator’s revenue from TikTok

Collecting from a platform such as TikTok can involve a number of legal considerations, such as the location of any implicated users, appropriate retrieval of the post data, and establishment of chain of custody. 

If you anticipate that a matter will involve the collection and review of TikTok data, it is crucial to engage with a partner who can not only provide a platform that will facilitate intelligible review of the relevant content, but who also has experience dealing with similar situations.

LinkedIn

Launched in 2003, LinkedIn emerged as a platform for professional networking, allowing users to build online resumes and connect with potential employers and colleagues. 

LinkedIn was designed to function similarly to a digital resume, allowing users to showcase their work experience, skills, and accomplishments; however, it soon transcended its original purpose, and now serves as a space for industry news, thought leadership, and even virtual socializing. Many describe it as the “Facebook of business,” signifying its wide global reach and cultural significance.

LinkedIn currently has over 900 million members in more than 200 countries and regions worldwide. The United States has the most members, with over 199 million, followed by India with 101 million.

LinkedIn in court

Although LinkedIn has not faced as many legal actions as Facebook or other social media entities, it has been involved in court cases regarding data scraping, where the legal boundaries of accessing public profile information are debated. 

Content posted on LinkedIn (publicly, privately, or anywhere in between) can potentially be used as evidence in court, just like any other social media content. For instance, LinkedIn posts have been used to establish witness credibility (or lack thereof) and in employment disputes.

Ediscovery best practices for LinkedIn

Much like the other social networking platforms, LinkedIn hosts a number of data categories, including public content (posts, articles, comments, reactions), private content (chats, messages), account information, and metadata pertaining to all other types of content. 

Although for the most part, LinkedIn activity is more likely to fall on the professional, public-facing side, the business-related nature of the platform means that LinkedIn content could easily prove to be valuable, or even dispositive, in a legal action. 

LinkedIn presents itself as a tool to grow one’s business, and more than 58 million companies use LinkedIn to recruit or advertise. It is certainly feasible that, in the event any of these companies were accused of malfeasance, LinkedIn content could be brought in as evidence. 

Thus far, LinkedIn content has been used in court cases far less than evidence collected from other platforms; however, it is always worth considering this possibility and working with a partner who has experience with LinkedIn ediscovery. 

Overview: A strategic approach to social media ediscovery

When conducting social media ediscovery, understanding how your clients use these various platforms will enable you to construct a plan to manage the growing data volumes. 

Where do you look for social media data?

The key to determining which technology to investigate is to understand how relevant custodians are communicating, and on which platforms.

Understanding the nature of a case, plus if and how custodians are leveraging social media, helps determine the priority of discovery. Many businesses also use these platforms for their communications, so this doesn’t apply only to individual employees.

What do you look for?

Each social media data source contains a potentially voluminous amount of disparate data dating back to the inception of a user’s account. It is important to understand what your technology partner will include in their capture of such data and what metadata will be included. 

User-generated social media ESI may include: 

  • Engagement data (for example, a user’s posts, likes, and comments) 
  • Direct messages 
  • Chat logs 
  • Friends or connections 
  • Profile 
  • Log-on and posting times 
  • Location data from photographs
  • Some deleted materials 

System-generated social media ESI may include: 

  • Proprietary unique identifier 
  • Item type 
  • Parent item/thread 
  • Recipients 
  • Author/poster 
  • Linked media 
  • IP addresses
  • Location data from IoB and IoT devices
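
System-generated fields such as the unique identifier and parent item are what allow a review tool to rebuild a conversation from individually collected items. The sketch below shows one way this could work. The `CollectedItem` field names and the sample data are assumptions chosen to mirror the list above, not any platform's or vendor's actual export schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class CollectedItem:
    """Hypothetical record for one collected post, comment, or message."""
    item_id: str                     # proprietary unique identifier
    item_type: str                   # e.g. "post", "comment", "reply"
    author: str                      # author/poster
    parent_id: Optional[str] = None  # parent item/thread, None for a root
    recipients: list[str] = field(default_factory=list)
    linked_media: list[str] = field(default_factory=list)


def index_children(items: list[CollectedItem]) -> dict[Optional[str], list[CollectedItem]]:
    """Group items by the identifier of their parent item."""
    children = defaultdict(list)
    for item in items:
        children[item.parent_id].append(item)
    return children


def walk_thread(root_id: str, by_id: dict, children: dict, depth: int = 0) -> Iterator[tuple]:
    """Yield (depth, item) pairs for a thread in reading order."""
    yield depth, by_id[root_id]
    for child in children.get(root_id, []):
        yield from walk_thread(child.item_id, by_id, children, depth + 1)


items = [
    CollectedItem("p1", "post", "alice"),
    CollectedItem("c1", "comment", "bob", parent_id="p1"),
    CollectedItem("r1", "reply", "alice", parent_id="c1"),
    CollectedItem("c2", "comment", "carol", parent_id="p1"),
]
by_id = {item.item_id: item for item in items}
children = index_children(items)

for depth, item in walk_thread("p1", by_id, children):
    print("  " * depth + f"{item.author} ({item.item_type})")
```

If a collection omits the parent identifier, this kind of reconstruction is not possible, which is one reason to ask what metadata a capture will include.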

How do you collect social media ESI?

While it may be enticing to print out a screen capture of a public social media site, or even have the account owner press the “download your data” button offered by several platforms, it is important to remember that this will not necessarily include all the useful or relevant data, and may be limited to only public posts. 

Additionally, some social platforms limit what you are able to export based on the type of account a user maintains. Working with a forensic collection technology that specializes in social media collection will ensure you are able to gain access to the full scope of potentially relevant information. It is also important to work with a technology that can render social media data into an easily reviewable format.

When collecting social media data, keep in mind that the format of the collection is paramount to ease of review, and not all collections are created equal. Some third-party collections may not do a good job of presenting the information in an easily digestible and easily producible manner.

Good and consistent collection of data = easy review and production

When can you use social media ESI?

While there is ample precedent and case law to support the inclusion of social media ESI (even data that is private) in a discovery request, the requesting party still has an obligation to meet the requirements of FRCP 26(b) and demonstrate relevance to the case. 

And the bar for relevance under Federal Rule of Evidence 401 is far from high. Evidence is relevant if “it has any tendency to make a fact more or less probable than it would be without the evidence” and “the fact is of consequence in determining the action.” To ensure that this does not become a fishing expedition, the court will often limit subject matter and duration of admissible ESI.

An additional area of concern with social ESI in particular is authentication – meaning, the account and material posted to it were actually generated by the custodian or named account owner. As with relevance, the bar is not terribly high. Federal Rule of Evidence 901 states that to establish authenticity, “the proponent must produce evidence sufficient to support a finding that the item is what the proponent claims it is,” and one way to do this is by presenting the “distinctive characteristics” of an account according to 901(b)(4). These characteristics may include account name, photos of the account owner, nicknames, IP address, specific topics, or slang. Keep in mind: this type of information is only available if the data was properly collected.

Ediscovery expanded: mastering complex data from Slack to Signal and beyond

Now that you’ve learned how to do ediscovery with social media, including the nuances of social media data collection, uncover the considerations and best practices for handling other complex data types in ediscovery.

Download the complete guide to Ediscovery for Complex Data Types.

And, if you’re ready to collect data from collaborative data sources with DISCO, request a demo to see what we can do for you.

Julio Ruelas
Manager, Data Operations

Julio Ruelas is the manager of the Data Operations department at DISCO and has been with DISCO since 2017.  He has over 18 years of experience in ediscovery working at various vendors throughout the industry.
