
Reddit Sues Perplexity AI and Others for ‘Industrial-Scale’ Data Scraping

  • Writer: Lerin Astro
  • Oct 27, 2025
  • 2 min read

Updated: Oct 28, 2025


Reddit has filed a lawsuit against artificial intelligence company Perplexity AI and three other entities, accusing them of participating in an “industrial-scale, unlawful” operation to scrape millions of Reddit user comments for profit.


The lawsuit, filed in a New York federal court, targets San Francisco-based Perplexity, known for its AI chatbot and “answer engine” that competes with Google and ChatGPT. Also named are Oxylabs UAB, a Lithuanian data-scraping company; AWMProxy, described as a “former Russian botnet”; and Texas-based SerpApi, which lists Perplexity as a customer.


This marks Reddit’s second major legal battle with an AI firm after it sued Anthropic in June. However, this new case broadens its focus to include data brokers and proxy services that help AI firms acquire online material for training their models.

“Scrapers bypass technological protections to steal data and sell it to clients hungry for training material,” said Ben Lee, Reddit’s chief legal officer. “Reddit is a prime target because it’s one of the largest and most dynamic collections of human conversation ever created.”


The complaint accuses the defendants of unfair competition, unjust enrichment, and copyright violations. It also alleges that some of them circumvented Reddit’s anti-scraping systems by extracting data from Google Search results instead.

Lee added that the companies “mask their identities, hide their locations, and disguise their scrapers to steal Reddit content from Google Search,” calling Perplexity “a willing customer” that chose to buy “stolen data” rather than reach a legitimate licensing agreement.


Perplexity said it had not yet received the lawsuit but that it “will always fight vigorously for users’ rights to freely and fairly access public knowledge.” The company added that its approach is “principled and responsible,” emphasizing its mission to provide “accurate AI answers” while defending “openness and the public interest.”


SerpApi said it “strongly disagrees” with Reddit’s claims and intends to “vigorously defend itself.” Oxylabs expressed “shock and disappointment,” arguing that no company should “claim ownership of public data.” A statement from its executive, Denas Grybauskas, suggested Reddit might simply want to “sell the same public data at an inflated price.”


AWMProxy could not be reached for comment.


While scraping publicly available data is a widespread practice among researchers and businesses, Reddit compared the defendants’ actions to “would-be bank robbers” who, unable to enter the vault, “break into the armored truck instead.”


Reddit’s lawsuit underscores the growing tension between AI developers and content platforms over the use of publicly accessible online material for training large language models.


The company has already struck licensing deals with Google, OpenAI, and others, allowing them to legally train AI systems using Reddit’s vast archive of user discussions. These agreements have also supported Reddit’s financial growth ahead of its stock market debut last year.


© 2025 NextStep India
