
The Defensible Standard: Driving Review Efficiency with Mature CAL Workflows

Industry & Legal Education
4 Min Read
By: James Park
Posted: June 5, 2026

https://www.csdisco.com/blog/continuous-active-learning

Get the very best in litigation technology and expert partnership
Talk to sales
⚡️ 1-Minute DISCO Download

In the old world of ediscovery, document review was a linear slog — a brute-force effort to find a needle in a haystack by touching every piece of straw. Continuous Active Learning (CAL) uses real-time machine learning to prioritize relevance as you code, ensuring your reviewers spend their time on the documents that actually matter. 

Key Quote 💬

“The principles of CAL implementation lay the foundation for the governance practices that build trust in any predictive model.”

Dive Deeper

Still on the fence about CAL? Jump to the section “Implementing CAL: What Legal Teams Should Know.”

The sheer volume of data in modern litigation has turned traditional document review into a race against time and budget. As datasets grow from gigabytes to terabytes, and legal teams ponder the adoption of generative AI (GenAI) or agentic AI for document review, we’re taking a moment to return to the mature AI technology at the foundation of contemporary doc review: Continuous Active Learning, or CAL. 

Teams familiar with Technology-Assisted Review (TAR) know that CAL represents the gold standard, offering a dynamic, AI-driven approach that learns in real time. 

In this article, we’ll explore how CAL functions, how it outperformed legacy methodologies like TAR 1.0, and how, even today, CAL is helping legal teams find the "smoking gun" faster than ever before.

What is continuous active learning?

Continuous Active Learning (CAL or TAR 2.0) is an advanced machine learning protocol used in ediscovery to identify and prioritize relevant documents. The system analyzes the text, metadata, and context of every coded document to score the remaining unreviewed population, constantly re-sorting the review queue so the most likely responsive documents are always served next. 

This results in a cycle where the model effectively pushes nonresponsive documents to the bottom of the pile, surfacing the signal through the noise.

How CAL works

CAL operates on a feedback loop that functions much like a streaming service’s recommendation engine. Every time a reviewer tags a document as responsive or nonresponsive, the CAL algorithm analyzes the text, metadata, and context of that file.

The model then scores every unreviewed document in the database based on what it has learned from the documents coded so far, both responsive and nonresponsive. Instead of a static list, the review queue becomes a living organism.

In real time, the system reprioritizes the remaining population, serving the highest-scoring documents to the review team next. As the review progresses, the model becomes increasingly precise, effectively "clustering" relevance so that reviewers spend their time on the data that actually moves the needle on the case.
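
The tag, rescore, re-sort cycle described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not DISCO's implementation: `ToyRelevanceModel`, `cal_review`, the sample documents, and the simple word-count scoring are all hypothetical stand-ins for a production classifier.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class ToyRelevanceModel:
    """Hypothetical stand-in for a real classifier: per-word tag counts."""

    def __init__(self):
        self.responsive = Counter()
        self.nonresponsive = Counter()

    def update(self, text, is_responsive):
        # Learn from one reviewer decision.
        counts = self.responsive if is_responsive else self.nonresponsive
        counts.update(set(tokenize(text)))

    def score(self, text):
        # Sum of smoothed log-odds per word; higher means more likely responsive.
        return sum(
            math.log((self.responsive[tok] + 1) / (self.nonresponsive[tok] + 1))
            for tok in set(tokenize(text))
        )


def cal_review(documents, reviewer, seed_id):
    """Serve documents one at a time, rescoring the queue after every tag."""
    model = ToyRelevanceModel()
    unreviewed = dict(documents)
    order = []
    next_id = seed_id
    while unreviewed:
        text = unreviewed.pop(next_id)
        model.update(text, reviewer(next_id, text))  # reviewer tags the document
        order.append(next_id)
        if unreviewed:
            # Re-sort: the highest-scoring unreviewed document is served next.
            next_id = max(unreviewed, key=lambda d: model.score(unreviewed[d]))
    return order


# Hypothetical five-document collection and reviewer decisions.
documents = {
    "d1": "pricing agreement with competitor",
    "d2": "lunch menu friday",
    "d3": "competitor pricing call notes",
    "d4": "parking garage closed friday",
    "d5": "agreement on pricing floor",
}
truth = {"d1": True, "d2": False, "d3": True, "d4": False, "d5": True}

order = cal_review(documents, lambda doc_id, text: truth[doc_id], seed_id="d1")
print(order)  # the three responsive documents are served before the other two
```

In this sketch the loop runs until the queue is empty. In practice a team would stop earlier and validate, which is where the cost savings discussed later come from.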

🔍 Next steps: Discover how to build a modern, AI-driven review workflow in our guide: How to Use Generative AI for Document Review.

How CAL differs from other review approaches

To appreciate CAL, you have to look at the workflows it replaced.

TAR 1.0 (Seed Set Model)

TAR 1.0 was a massive step forward, but it was rigid. It required collecting all of the review documents before starting, selecting a seed set, and completing an upfront training phase in which senior attorneys coded thousands of both relevant and irrelevant documents to teach the model.

Looking back at this approach from the AI age, it’s easy to see the enormous inefficiencies of this model – and the inevitable workflow bottleneck it produced. 

The trained model was also static: if the case theory shifted or a new issue emerged mid-review, it couldn’t adapt. You had to stop, retrain, and restart. CAL evolved to eliminate this stop-and-start nature.

Struggling with bottlenecks? DISCO’s Professional Services team is available on demand to partner with you on your case needs. Learn more.

Diving deeper into TAR 1.0: Simple Passive Learning (SPL) and Simple Active Learning (SAL)

Initially, TAR 1.0 used Simple Passive Learning (SPL) that relied on random samples to generate training sets. While effective, this meant that attorneys had to use their valuable time reviewing documents unlikely to improve the model’s performance. 

Simple Active Learning (SAL) evolved to address this inefficiency. By focusing the attorney’s review on documents that the model was least sure about, SAL allowed the model to train more quickly and with fewer documents needing human review. 

However, even with SAL, the limitations of TAR 1.0 persisted, including the bottleneck described above and typically low precision (the share of documents predicted responsive that actually are responsive).

CAL overcame these limitations entirely, eliminating the bottleneck, adapting more readily to changing document populations, and achieving higher precision.
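
One way to see the difference between SPL, SAL, and CAL is to look at which document each approach would pick next from the same scored population. The sketch below is illustrative only: the function names and the scores are hypothetical, and it assumes the model outputs a probability of responsiveness between 0 and 1.

```python
import random


def select_spl(scores, rng):
    """Simple Passive Learning: pick a random unreviewed document."""
    return rng.choice(sorted(scores))


def select_sal(scores):
    """Simple Active Learning: pick the document the model is least sure about."""
    return min(scores, key=lambda doc_id: abs(scores[doc_id] - 0.5))


def select_cal(scores):
    """Continuous Active Learning: pick the document most likely to be responsive."""
    return max(scores, key=scores.get)


# Hypothetical model scores for four unreviewed documents.
scores = {"d1": 0.95, "d2": 0.52, "d3": 0.10, "d4": 0.70}

spl_pick = select_spl(scores, random.Random(0))  # any document, chosen at random
sal_pick = select_sal(scores)                    # "d2": closest to 0.5
cal_pick = select_cal(scores)                    # "d1": highest score
print(spl_pick, sal_pick, cal_pick)
```

The design difference is the goal of each pick: SAL spends reviewer time on documents that teach the model the most, while CAL spends it on documents most likely to matter to the case, and keeps learning from those tags anyway.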

🔑To understand the broader context of how predictive technology is evolving from TAR and CAL to modern GenAI, check out our companion article: TAR in the Age of GenAI.

TAR 1.0 vs. CAL/TAR 2.0 Comparison Chart

See the differences between TAR 1.0 and CAL/TAR 2.0 at a glance.

| Feature | TAR 1.0 | CAL |
| --- | --- | --- |
| Learns from reviewers | Upfront, through training rounds | Continuously |
| Adapts during review | No | Yes |
| Best for | Mid-size and large matters | Large, complex matters |
| Court accepted | Yes | Yes |

CAL and the ediscovery process

In the Electronic Discovery Reference Model (EDRM), the review stage has historically had the biggest budgetary impact.

CAL targets this cost center by accelerating relevance realization.

The CAL effect: a hypothetical example

Consider a high-stakes antitrust matter with 1 million documents. In a linear review, you might find your first smoking-gun document on day 45. 

With CAL, the system recognizes the patterns of relevant documents. By prioritizing these relevant documents, it might serve that same critical evidence on day three. By day 10, you’ve seen 80% of the relevant documents despite only reviewing 30% of the total volume. 

This allows you to walk into a settlement conference or deposition with a complete hand while the opposition is still sorting through their first 50,000 files.
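
To put rough numbers on the hypothetical above, here is a short worked calculation. The document count, the 30% review depth, and the 80% recall come from the example; the 10% prevalence (share of documents that are relevant) is an assumption added purely for illustration, as is the idea that a linear review finds relevant documents at the base rate.

```python
total_docs = 1_000_000   # from the example
review_depth_pct = 30    # from the example: share of the volume reviewed by day 10
recall_pct = 80          # from the example: share of relevant documents seen by day 10
prevalence_pct = 10      # ASSUMPTION (not in the example): share of documents that are relevant

relevant_total = total_docs * prevalence_pct // 100
docs_reviewed = total_docs * review_depth_pct // 100
found_with_cal = relevant_total * recall_pct // 100

# ASSUMPTION: a linear review surfaces relevant documents at the base rate,
# so reviewing 30% of the volume finds roughly 30% of them.
found_with_linear = relevant_total * review_depth_pct // 100

lift = found_with_cal / found_with_linear

print(f"Reviewed {docs_reviewed:,} of {total_docs:,} documents")
print(f"Relevant found: {found_with_cal:,} with CAL vs. about {found_with_linear:,} linearly")
print(f"Lift over linear review: {lift:.2f}x")
```

Under these assumptions, the same 300,000 documents of review effort surface 80,000 relevant documents instead of roughly 30,000. A different prevalence changes the absolute counts but not the ratio.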

👉Note: You can review the stages of EDRM in this article.

The benefits of continuous active learning

The move to CAL provides tangible advantages that can change the trajectory of a case:

Review consistency and quality

Human reviewers are prone to fatigue and inconsistent interpretations. CAL serves as a high-level quality-control mechanism. 

If a document is scored by the model at a 90% relevance probability, yet tagged as nonresponsive by a reviewer, the system can flag that document for a second look, catching human errors in real time.

DISCO's quality control feature is a good example: it supports accuracy and efficiency in your review results, keeping your productions reliable and consistent.
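
The second-look check described above can be sketched as a simple filter over coded documents. This is a minimal illustration, not a description of any product's QC feature: `CodedDocument`, `find_conflicts`, and the sample data are hypothetical. The 0.90 threshold mirrors the example in the text; the mirror-image check for low-scoring documents tagged responsive is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class CodedDocument:
    doc_id: str
    model_score: float        # model's estimated probability of responsiveness
    tagged_responsive: bool   # the reviewer's decision


def find_conflicts(coded, high=0.90, low=0.10):
    """Return documents where the reviewer's tag disagrees with a confident score."""
    conflicts = []
    for doc in coded:
        if doc.model_score >= high and not doc.tagged_responsive:
            conflicts.append(doc)   # model says likely responsive, reviewer said no
        elif doc.model_score <= low and doc.tagged_responsive:
            conflicts.append(doc)   # ASSUMED extra check: the reverse disagreement
    return conflicts


# Hypothetical coded documents.
coded = [
    CodedDocument("d1", 0.95, False),  # conflict: high score, tagged nonresponsive
    CodedDocument("d2", 0.95, True),   # agreement
    CodedDocument("d3", 0.05, True),   # conflict: low score, tagged responsive
    CodedDocument("d4", 0.40, False),  # model not confident, so not flagged
]

flagged = [doc.doc_id for doc in find_conflicts(coded)]
print(flagged)  # ['d1', 'd3']
```

A flag here is only a prompt for a second look. The reviewer may well be right, in which case the corrected or confirmed tag is itself useful training signal.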

👀Want to learn more? This mini guide shows you how to set up and QC your first GenAI-powered document review.

Time savings

Prioritizing relevance means your team gets to the story of the case faster. This collapses timelines for depositions and allows partners to make strategic go or no-go decisions in weeks rather than months.

Cost reduction

The math is simple: CAL allows you to stop reviewing once you’ve reached a point of diminishing returns. If statistical metrics show that the review has reached a reasonable level of recall (the percentage of all responsive documents that have been found), typically 70% to 80%, you can stop the review and validate the result, potentially saving hundreds of thousands of dollars in review costs.
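
One simple signal of diminishing returns is the share of responsive documents in recent review batches. The sketch below is a hedged illustration: the function names, the 5% threshold, the three-batch window, and the batch data are all hypothetical choices, and a drop in batch richness is a cue to run statistical validation, not a substitute for it.

```python
def batch_richness(batch_tags):
    """Fraction of a batch that reviewers tagged responsive."""
    return sum(batch_tags) / len(batch_tags)


def reached_diminishing_returns(batches, threshold=0.05, window=3):
    """True when each of the last `window` batches fell below `threshold` richness."""
    if len(batches) < window:
        return False
    return all(batch_richness(batch) < threshold for batch in batches[-window:])


# Hypothetical review history: each batch is 25 reviewer tags (True = responsive).
batches = [
    [True] * 18 + [False] * 7,   # 72% responsive early in the review
    [True] * 10 + [False] * 15,  # 40%
    [True] * 1 + [False] * 24,   # 4%
    [False] * 25,                # 0%
    [True] * 1 + [False] * 24,   # 4%
]

early_signal = reached_diminishing_returns(batches[:3])  # False: early batches are still rich
late_signal = reached_diminishing_returns(batches)       # True: last three batches are below 5%
print(early_signal, late_signal)
```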

Risk management

CAL provides a transparent, auditable trail of how decisions were made. Because it uses statistical validation (such as elusion testing) to estimate how many responsive documents were left behind, it offers a quantitative evaluation of the review that a manual, eyes-on review typically does not provide.

Implementing CAL: What legal teams should know

Adopting CAL requires a shift in mindset from execution to oversight — a shift that is crucial for safely integrating the next generation of generative and agentic AI. 

Workflow integration

For a modern, AI-native review to be efficient, CAL must be fully integrated into a well-established workflow that leverages all of the available tools. This integration eliminates human bottlenecks like manual batching, creating a continuous, AI-powered loop. 

By centralizing the review feed, platforms like DISCO establish the seamless data flow required for advanced agentic AI to function. Reviewers move from coding a document (CAL) to asking a question about it (Generative AI) without ever breaking their focus.

📚Related reading: Tag Predictions: How DISCO AI Is Bringing Deep Learning to Legal Technology (Whitepaper)

Training the model for accuracy

The garbage-in-garbage-out rule is more critical than ever. Because CAL demands precise coding to train its prioritization model, effective quality control (QC) is highly important. QC methodologies, such as verifying AI-human conflicts, ensure that the model can make informed and accurate predictions based on prior coding of documents. 

Keeping the process defensible

Any technology used to filter evidence, whether it's CAL or a generative AI tool, must be transparent and auditable in court. CAL established the standard for defensibility through validation checkpoints like recall calculation and elusion tests, where a random sample of discarded documents is reviewed. 

Teams must now apply this same rigorous methodology to all AI-assisted review. Documenting the statistical validity of the CAL process and the audit trails of GenAI is what keeps your entire AI-powered strategy defensible in front of a judge.
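
The elusion test and recall calculation mentioned above can be illustrated with a short calculation. This is a minimal sketch with hypothetical numbers and function names: it draws a random sample from the discarded (unreviewed) documents, uses the share found responsive to estimate how many were missed, and derives a point estimate of recall. A real validation protocol would also report a confidence interval and follow whatever the parties or the court have agreed to.

```python
import random


def draw_elusion_sample(discard_ids, sample_size, seed):
    """Randomly sample discarded documents for human review."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced for audit
    return rng.sample(discard_ids, sample_size)


def estimate_recall(responsive_found, discard_size, sample_size, responsive_in_sample):
    """Point estimates of elusion rate, missed documents, and recall."""
    elusion_rate = responsive_in_sample / sample_size
    estimated_missed = elusion_rate * discard_size
    recall = responsive_found / (responsive_found + estimated_missed)
    return elusion_rate, estimated_missed, recall


# Hypothetical numbers for illustration only.
discard_ids = range(700_000)   # documents never served to reviewers
sample = draw_elusion_sample(discard_ids, sample_size=1_500, seed=42)

elusion_rate, estimated_missed, recall = estimate_recall(
    responsive_found=80_000,      # responsive documents found during review
    discard_size=len(discard_ids),
    sample_size=len(sample),
    responsive_in_sample=30,      # responsive documents reviewers found in the sample
)

print(f"Elusion rate: {elusion_rate:.1%}")
print(f"Estimated responsive documents left behind: {estimated_missed:,.0f}")
print(f"Estimated recall: {recall:.1%}")
```

Recording the seed, the sample, and the resulting estimates is the kind of documentation that supports the audit trail described in this section.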

🔍Dive deeper: Ediscovery 101: Guide to Ediscovery Rules and Best Practices

CAL’s impact on discovery

CAL ended the era of guesswork discovery. By turning the document review process into a dynamic conversation between human expertise and machine efficiency, legal teams could finally stop chasing data and start building their cases.

As data volumes continue to climb, CAL remains an effective way to find all of the needles in the haystack without having to touch every piece of straw.

See how DISCO puts continuous active learning into practice

DISCO was built to make the most complex reviews feel effortless. Our AI-native platform integrates CAL directly into the reviewer’s daily experience, ensuring you find the truth faster and more affordably.

Get a head start: Schedule a demo with our experts today.

James Park
Director of AI Consulting

I am the AI Consulting Director at DISCO, guiding our Fortune 500 and AmLaw 200 clients in leveraging technology, analytics, and expertise around electronic discovery and risk management. I've led teams in a wide range of matters, including Second Requests, IP litigation, environmental litigation, FCPA inquiries, government subpoena and CID responses, and numerous other civil litigations. I've also appeared on behalf of my clients before the Department of Justice and federal courts. Prior to joining DISCO, I was a Senior Director of the Engagement Management Group at Lighthouse, where I led their Research, Modeling & Analytics group providing countless services including Technology Assisted Review, Key Document Identification, and Keyword Consulting. I received my B.S. from the University of California, Davis, and my J.D. from Indiana University Maurer School of Law.
