Handshake, the $3.3 billion AI data labeling platform, just acquired Cleanlab in a strategic acquihire that brings nine key researchers in-house. The deal, which beat out competing offers from other AI data labeling companies, signals intensifying consolidation in the critical infrastructure powering today's AI models. Cleanlab's MIT-trained team pioneered algorithms that automatically audit data quality without human reviewers, a capability Handshake needs as it races toward hundreds of millions in revenue serving eight major AI labs including OpenAI.
Handshake just made a calculated bet on AI data quality, and it beat out several rivals to do it. The company, best known for connecting college students with employers since 2013, has acquired Cleanlab, a startup focused on auditing the accuracy of AI training data, TechCrunch reports.
The deal is structured as an acquihire, bringing nine key Cleanlab employees into Handshake's research organization. Among them are the startup's three co-founders, all MIT computer science PhDs: Curtis Northcutt, Jonas Mueller, and Anish Athalye. While financial terms weren't disclosed, acquihires can be surprisingly lucrative for founders, especially when multiple suitors are involved.
Cleanlab wasn't hurting for options. Northcutt told TechCrunch the company fielded acquisition interest from several AI data labeling competitors. But Handshake had a unique advantage: it's the talent marketplace that other data labeling companies, including Mercor, Surge, and Scale AI, use to find specialized human labelers like doctors, lawyers, and scientists for their own projects.
"If you're going to pick one, you should probably pick the source, not the middleman," Northcutt explained to TechCrunch. It's a telling comment about the power dynamics in AI's infrastructure layer, where Handshake has quietly positioned itself as the talent engine behind the industry's data labeling operations.
Handshake pivoted into AI data labeling about a year ago, launching a human labeling business to serve foundation model companies. The timing proved prescient. The company was last valued at $3.3 billion in 2022 and was forecast to end 2025 at a $300 million annualized revenue run rate. Now it's reportedly on track to reach ARR in the "high hundreds of millions" this year, according to Upstarts Media. The company already provides data for eight top AI labs, including OpenAI.
Cleanlab, founded in 2021, built software that tackles one of AI's messiest problems: ensuring the data humans label is actually correct. The startup's algorithms can automatically flag incorrect labels without requiring a second human reviewer, a capability that becomes increasingly valuable as AI companies scale their training data operations into the billions of data points.
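To make the idea of flagging bad labels without a second reviewer concrete, here is a minimal sketch. It is not Cleanlab's product code, and the article does not describe the company's actual algorithms. It assumes a trained classifier has already produced per-class probabilities for each labeled example, and it uses a simplified per-class confidence threshold loosely in the spirit of the "confident learning" research associated with Cleanlab's founders. The function name `flag_label_issues` and the toy data are hypothetical.

```python
def flag_label_issues(labels, pred_probs):
    """Return indices of examples whose human-assigned label looks suspicious.

    labels:     list of integer class ids assigned by human labelers.
    pred_probs: list of per-class probability lists from any classifier
                (assumed here; ideally predictions on held-out data).
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the model's average confidence in class k,
    # measured over the examples that humans labeled as class k.
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for y, p in zip(labels, pred_probs):
        sums[y] += p[y]
        counts[y] += 1
    thresholds = [
        sums[k] / counts[k] if counts[k] else 1.0 for k in range(n_classes)
    ]

    flagged = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes the model is "confident" about for this example.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        if confident:
            best = max(confident, key=lambda k: p[k])
            # The model confidently prefers a class other than the given label.
            if best != y:
                flagged.append(i)
    return flagged


# Toy data (made up for illustration): example 2 is labeled 0,
# but the model strongly believes it belongs to class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.95, 0.05],
    [0.85, 0.15],
    [0.10, 0.90],
    [0.05, 0.95],
    [0.15, 0.85],
    [0.40, 0.60],
]
print(flag_label_issues(labels, pred_probs))  # [2]
```

In this sketch, an example the model is merely unsure about (the last row) is not flagged; only labels that contradict a confident prediction are surfaced for review, which is what lets this kind of audit run without a second human pass over every item.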
"We have an in-house research team that thinks a lot about where our models are weak, what data should we be producing? How high quality is that data?" Sahil Bhaiwala, Handshake's chief strategy and innovation officer, told TechCrunch. "The Cleanlab team has been focusing on this problem for years."
That expertise didn't come cheap to build. Cleanlab raised $30 million in total from investors including Menlo Ventures, TQ Ventures, Bain Capital Ventures, and Databricks Ventures. At its peak, the startup employed more than 30 people. That Handshake is bringing over just nine of them suggests a highly targeted talent acquisition focused on the core research team rather than the full operational staff.
The acquisition reflects broader consolidation in AI's critical infrastructure layer. As competition intensifies among frontier AI labs, the companies providing the picks and shovels (data labeling and quality assurance tools) are merging capabilities to serve increasingly demanding customers. Data quality has emerged as a key bottleneck in AI development, with models only as good as the training data fed into them.
For Handshake, the deal solves a strategic problem. The company can source expert labelers at scale, but ensuring those labelers produce consistently accurate work requires sophisticated quality control systems. Cleanlab's automated auditing technology, combined with its team's deep expertise in machine learning and data quality, fills that gap.
The competitive dynamics here are worth noting. Handshake now owns technology that could theoretically audit the quality of data produced by competitors who rely on Handshake's own marketplace to find labelers. Whether that creates conflicts or simply cements Handshake's position as the dominant platform in the space remains to be seen.
What's clear is that AI companies are willing to pay premium prices for talent that can solve data quality challenges. As models grow larger and more capable, the marginal value of higher-quality training data increases dramatically. A small improvement in data accuracy can translate to meaningful performance gains in the resulting AI model, making teams like Cleanlab's increasingly valuable strategic assets.
The Cleanlab acquisition marks a pivotal moment in AI infrastructure consolidation, where companies that control both the talent marketplace and quality assurance tools hold increasingly outsized leverage. For Handshake, adding automated data auditing capabilities to its expert labeler network creates a vertically integrated operation that competitors will struggle to replicate. As AI labs pour billions into model development, the unsexy business of data labeling and quality control is emerging as one of the industry's most strategically valuable positions. Watch for more M&A activity in this space as companies race to build complete data pipelines from sourcing through validation.