Why Handshake’s Acquisition of Cleanlab Could Redefine AI’s Data Backbone

For busy readers

Handshake has acqui-hired Cleanlab — bringing nine key researchers and data quality tech in-house.
The move strengthens Handshake’s position in the $20B+ AI data labeling and quality market by improving reliability and scalability.
This acquisition reflects a broader industry shift where data quality is becoming as strategic as model architecture.

A strategic pivot in the AI supply chain

AI models are only as good as the data they’re trained on. Garbage data leads to hallucinations, bias, and unpredictable outputs — problems that have dogged large language models and other foundation AI systems. That’s where data labeling and quality assurance come in: they turn messy human annotations into dependable training material.

Handshake — founded in 2013 as a collegiate job marketplace — seized this moment roughly a year ago when it shifted into human-powered data labeling services for AI labs building cutting-edge models. What started as a niche service quickly scaled: the company was valued at $3.3 billion in 2022, served eight major AI labs including OpenAI, and was projected to move toward “high hundreds of millions” in annualized revenue in 2026.

But as AI use cases broaden, so too does the need for quality control — not just more labelers.

What Cleanlab brings, and why it matters

Rather than acquiring Cleanlab for its product or user base, Handshake acqui-hired its research team, including three MIT PhD co-founders, and folded both the people and their technology into its core R&D group.

Cleanlab specializes in confident learning and automated auditing algorithms that can detect mislabeled or low-quality data without requiring a second human review. In practical terms, this means:

Faster quality control
Less manual overhead
Better consistency in labeled data
All three translate directly into more reliable training corpora for AI models.
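
To make the idea of confident learning concrete, here is a minimal sketch in plain Python. It is a simplified illustration, not Cleanlab’s actual implementation: the function names are hypothetical, and it assumes you already have out-of-sample predicted probabilities from some classifier. Each class gets a threshold equal to the model’s average confidence on examples carrying that label, and an example is flagged when the model is confidently pointing at a different class than the one the annotator gave.

```python
from typing import List


def class_thresholds(labels: List[int], probs: List[List[float]], n_classes: int) -> List[float]:
    """Per-class threshold: mean predicted probability of class j over examples labeled j."""
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for y, p in zip(labels, probs):
        sums[y] += p[y]
        counts[y] += 1
    # A class with no labeled examples gets threshold 1.0, so it is rarely counted as confident.
    return [sums[j] / counts[j] if counts[j] else 1.0 for j in range(n_classes)]


def find_likely_label_errors(labels: List[int], probs: List[List[float]]) -> List[int]:
    """Return indices whose given label disagrees with the model's confident class (hypothetical helper)."""
    n_classes = len(probs[0])
    thresholds = class_thresholds(labels, probs, n_classes)
    flagged = []
    for i, (y, p) in enumerate(zip(labels, probs)):
        # Classes the model is "confident" about for this example.
        confident = [j for j in range(n_classes) if p[j] >= thresholds[j]]
        if not confident:
            continue  # the model is not confident about any class; leave the label alone
        best = max(confident, key=lambda j: p[j])
        if best != y:
            flagged.append(i)
    return flagged


# Toy data (made up for illustration): six examples, two classes.
labels = [0, 0, 0, 1, 1, 1]
probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],  # labeled 0, but the model strongly predicts class 1
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]

issues = find_likely_label_errors(labels, probs)
print(issues)  # [2]
```

The flagged index can then be routed to a human for a targeted second look, rather than re-reviewing every annotation.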

According to Cleanlab’s CEO Curtis Northcutt, the decision to join Handshake came in part because many AI data labeling platforms already rely on Handshake’s talent pipeline. Going to the source of that talent, he argued, makes for a stronger combined offering than working through a middleman.

Why this matters to the AI industry

There’s a reason this acquisition is more than just another M&A headline:

A shift from volume to quality

In the early days of AI, demand was overwhelmingly for lots of labeled data. Today, the emphasis is on trustworthy data — and automated quality checks are essential as models scale deeper into sensitive domains like healthcare, finance, and law.

Vertical integration of data pipelines

By bringing Cleanlab’s verification tools into its fold, Handshake is combining:

Human data labeling networks
Automated quality assurance
Frontier AI research muscle
This gives it a more defensible position against rivals like Scale AI, Surge AI, and Mercor — all competing for slices of the same infrastructure market.

Reducing reliance on manual quality control

Cleanlab’s confident learning algorithms let systems find their own mistakes, reducing the need for large quality teams and speeding up training cycles — a competitive edge in an industry where speed and accuracy both matter.

What this means for Handshake’s future

Handshake executives have been candid about the strategic intent: this acquisition isn’t a sideshow. As Sahil Bhaiwala, Handshake’s Chief Strategy and Innovation Officer, explained, the Cleanlab team strengthens Handshake’s internal research capabilities, helping the company understand where AI models are weak and what data should be produced to fix those weaknesses.

Cleanlab’s co-founders — who have guided the company’s open-source data-centric tools to massive adoption and billions of cleaned datapoints — will now help shape Handshake’s AI roadmap, not just its labeling operations. Importantly, some of Cleanlab’s open-source technologies will remain available under permissive licensing, maintaining continuity for the research community that already relies on them.

Broader implications for the market

This deal highlights a notable trend: as AI systems become more powerful, the infrastructure beneath them is professionalizing and consolidating. Early on, companies competed on the volume of annotations they could deliver. Now, accuracy, efficiency, and tooling are front and center.

In a marketplace where data errors can have real-world consequences, players that master both human expertise and automated auditing are poised to lead. Handshake’s acquisition of Cleanlab suggests the next battleground isn’t just who has the most labelers — it’s who can guarantee the highest quality training data with the least friction.

End note

AI progress doesn’t just come from bigger models — it comes from better data. By folding Cleanlab’s research into its operations, Handshake isn’t just growing bigger. It’s trying to become the company that defines what trusted, high-quality AI training data looks like in the age of foundation models. And that might be even more important than the next algorithm.