Reddit sues Perplexity for allegedly stealing data via scrapers

Reddit just filed a bombshell lawsuit against AI search startup Perplexity, alleging the company used third-party data scrapers to steal Reddit content and circumvent the platform's protections. The lawsuit targets not just Perplexity but three data-scraping companies that Reddit says are fueling an "industrial-scale data laundering economy" in AI. This marks Reddit's most aggressive legal push yet to monetize its treasure trove of human conversation data that's become gold for AI training.

Reddit is declaring war on AI companies that won't pay up. The social platform just dropped a lawsuit against Perplexity AI and three data-scraping companies, accusing them of running an elaborate scheme to steal Reddit's most valuable asset - millions of authentic human conversations.

The complaint filed today reads like a heist movie. Reddit compares the defendants to "would-be bank robbers" who "knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead." The armored truck in this case? Third-party scrapers SerpApi, Oxylabs, and AWMProxy that Reddit says Perplexity hired to do its dirty work.

Here's where it gets juicy. Reddit sent Perplexity a cease-and-desist letter back in May 2024, demanding the AI search company stop scraping Reddit data. Perplexity promised it would play nice and respect Reddit's robots.txt file. But according to the lawsuit, the volume of Reddit citations on Perplexity actually went up after that exchange.

Reddit didn't stop there. The company set a trap, creating a post that could only be crawled by Google. "Within hours," the lawsuit claims, Perplexity had somehow accessed and used that content in its answer engine. "The only way that Perplexity could have obtained that Reddit content," Reddit argues, "is if it and/or its co-defendants scraped Google search results."

This lawsuit hits at the heart of the AI industry's biggest tension right now. Reddit's treasure trove of human-written, community-ranked content is exactly what AI companies need to train better models. But Reddit learned its lesson from the 2023 API pricing controversy that sparked massive protests - if AI companies want the data, they need to pay for it.

The strategy's been working. Reddit has already struck lucrative deals with OpenAI for ChatGPT integration and Google for AI training access. The company reportedly wants even better terms as these partnerships come up for renewal.

"AI companies are locked in an arms race for quality human content," Ben Lee, Reddit's chief legal officer, said in a statement. "That pressure has fueled an industrial-scale 'data laundering' economy." Lee painted a picture of a shadowy ecosystem where scrapers "mask their identities, hide their locations, and disguise their web scrapers" to steal content from Google search results.
