Reddit sues AI company Perplexity and others for 'industrial-scale'
scraping of user comments
[October 23, 2025] By MATT O'BRIEN
Social media platform Reddit sued the artificial intelligence company
Perplexity AI and three other entities on Wednesday, alleging their
involvement in an “industrial-scale, unlawful” economy to “scrape” the
comments of millions of Reddit users for commercial gain.
Reddit's lawsuit in a New York federal court takes aim at San
Francisco-based Perplexity, maker of an AI chatbot and “answer engine”
that competes with Google, ChatGPT and others in online search.
Also named in the lawsuit are Lithuanian data-scraping company Oxylabs
UAB, a web domain called AWMProxy that Reddit describes as a “former
Russian botnet,” and Texas-based startup SerpApi, which lists Perplexity
as a customer on its website.
It's the second such lawsuit from Reddit since it sued another major AI
company, Anthropic, in June.
But the lawsuit filed Wednesday differs in that it confronts not just
an AI company but also the lesser-known services the AI industry
relies on to acquire the online writings needed to train AI
chatbots.
“Scrapers bypass technological protections to steal data, then sell it
to clients hungry for training material. Reddit is a prime target
because it’s one of the largest and most dynamic collections of human
conversation ever created,” said Ben Lee, Reddit’s chief legal officer,
in a statement Wednesday.
The lawsuit accuses the companies of unfair competition and unjust
enrichment and alleges that some of them violated U.S. copyright laws.

Perplexity said it has not yet received the lawsuit but “will always
fight vigorously for users’ rights to freely and fairly access public
knowledge. Our approach remains principled and responsible as we provide
factual answers with accurate AI, and we will not tolerate threats
against openness and the public interest.”
SerpApi's customer success director, Ryan Schafer, said in an email: “We
strongly disagree with Reddit’s allegations and intend to vigorously
defend ourselves in court.”
Oxylabs said in a statement it was “shocked and disappointed” and “will
not hesitate to defend itself against these allegations.”
“Oxylabs’ position is that no company should claim ownership of public
data that does not belong to them,” said a statement from Denas
Grybauskas, the company's chief governance and strategy officer. “It is
possible that it is just an attempt to sell the same public data at an
inflated price.”

The Perplexity website and logo are shown in this photo, in New
York, Friday, July 5, 2024. (AP Photo/Richard Drew, File)

AWMProxy could not immediately be reached for comment.
Scraping publicly available online data is a common practice
among businesses and researchers, but Reddit compares the companies
it is suing to “would-be bank robbers” who can't get into the bank
vault, so they break into the armored truck instead. The lawsuit
alleges they are evading Reddit’s own anti-scraping measures while
also “circumventing Google’s controls and scraping Reddit content
directly from Google’s search engine results.”
Lee said that because they're unable to scrape Reddit directly,
“they mask their identities, hide their locations, and disguise
their web scrapers to steal Reddit content from Google Search.
Perplexity is a willing customer of at least one of these scrapers,
choosing to buy stolen data rather than enter into a lawful
agreement with Reddit itself.”
Reddit made a similar argument in its lawsuit against Anthropic,
alleging that the company ignored Reddit's appeals to cease using
its content. That case was initially filed in California Superior
Court but was later moved to federal court and has a hearing
scheduled for January.
Along with digitized books and news articles, websites such as
Wikipedia and Reddit are deep troves of written materials that can
help teach an AI assistant the patterns of human language.
Reddit has previously entered licensing agreements with Google,
OpenAI and other companies that are paying to be able to train their
AI systems on the public commentary of Reddit’s more than 100
million daily users.
The licensing deals helped the 20-year-old online platform raise
money ahead of its Wall Street debut as a publicly traded company
last year.
All contents © copyright 2025 Associated Press. All rights reserved