
The rise of AI-generated content, also known as synthetic media, has mostly caused problems: It helps spread misinformation, steal from artists, and erode trust in what we see online. However, Cloudflare may have found a use case where artificial intelligence could help protect original content from the tentacles of AI companies.
On Wednesday, the company released AI Labyrinth, a tool that uses AI-generated content to “slow down, confuse, and waste the resources” of unauthorized AI crawlers.
Multiple studies have found that AI chatbots — including ChatGPT and Perplexity — are still accessing content from sites that block their crawlers. Cloudflare noted in the announcement that crawlers “generate more than 50 billion requests to the Cloudflare network every day or just under 1% of all web requests we see” — and how you block them matters.
“While Cloudflare has several tools for identifying and blocking unauthorized AI crawling, we have found that blocking malicious bots can alert the attacker that you are on to them, leading to a shift in approach, and a never-ending arms race,” the company explained. “We wanted to create a new way to thwart these unwanted bots, without letting them know they’ve been thwarted.”
When Cloudflare detects an unauthorized crawling request, AI Labyrinth does not simply block the crawler. Instead, it links to a series of AI-generated web pages that look real enough to convince the crawler they’re legitimate. The crawler believes it has successfully scraped the content it was looking for, while the site’s actual data stays protected. The crawler also squanders computational resources along the way, which Cloudflare counts as a win.
“Cloudflare will automatically deploy an AI-generated set of linked pages when we detect inappropriate bot activity, without the need for customers to create any custom rules,” the announcement explains.
The company used Workers AI and an open-source model to create unique, human-looking synthetic pages on various topics ahead of time, as creating them on demand could result in performance lags. This “pre-generation pipeline […] sanitizes the content to prevent any XSS vulnerabilities and stores it in R2 for faster retrieval,” the company said.
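As a rough illustration of that pre-generation step, the sketch below escapes generated text before storing it for later retrieval. It is not Cloudflare's pipeline: the in-memory `decoy_store` dictionary stands in for an object store such as R2, and the function names and key layout are hypothetical.

```python
import html
from typing import Dict, Optional

# Hypothetical in-memory stand-in for an object store such as R2.
decoy_store: Dict[str, str] = {}


def sanitize(text: str) -> str:
    """Escape HTML metacharacters so generated text cannot inject markup or scripts."""
    return html.escape(text, quote=True)


def pregenerate(topic: str, generated_text: str) -> str:
    """Sanitize one generated page ahead of time and store it under a topic-based key."""
    key = f"decoy/{topic}.html"  # assumed key layout, for illustration only
    body = sanitize(generated_text)
    decoy_store[key] = f"<article><h1>{sanitize(topic)}</h1><p>{body}</p></article>"
    return key


def fetch(key: str) -> Optional[str]:
    """Return a stored page, or None if nothing was pre-generated for that key."""
    return decoy_store.get(key)
```

Because the pages are built and escaped before any request arrives, serving one is a simple lookup, which is the performance argument the company makes for generating content ahead of time.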
AI Labyrinth only presents links to AI-generated content to AI scrapers; the content is otherwise hidden from human visitors on existing pages on the site and does not alter the site’s structure, appearance, or SEO.
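One way to picture that behavior is a function that leaves the page untouched for ordinary visitors and appends decoy links only when a request has been flagged. This is a minimal sketch under stated assumptions: `render_page`, the `suspected_crawler` flag, the link text, and the `rel="nofollow"` attribute are all hypothetical, and Cloudflare has not published its implementation in this form.

```python
import html
from typing import List


def render_page(page_html: str, suspected_crawler: bool, decoy_paths: List[str]) -> str:
    """Return the page unchanged for humans; append decoy links for flagged crawlers."""
    if not suspected_crawler:
        return page_html
    # Assumed markup: plain anchors pointing at pre-generated decoy pages.
    links = "".join(
        f'<a href="{html.escape(path, quote=True)}" rel="nofollow">More</a>'
        for path in decoy_paths
    )
    if "</body>" in page_html:
        # Insert just before the closing body tag so the document stays well-formed.
        return page_html.replace("</body>", links + "</body>", 1)
    return page_html + links
```

How a request gets flagged in the first place is outside this sketch; in the article's description that decision comes from Cloudflare's own bot detection.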
Cloudflare also noted it did not want the tool to add more AI slop to the internet at large. “It is important to us that we don’t generate inaccurate content that contributes to the spread of misinformation on the internet, so the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled,” the announcement added.
Additionally, Cloudflare believes the tool can act as a honeypot to help identify more illicit crawlers. The company noted that real human visitors are unlikely to “go four links deep into a maze of AI-generated nonsense,” and that the tool will, therefore, know based on click activity where new bots are popping up. This will in turn help AI Labyrinth better identify bad actors.
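The honeypot idea can be sketched as a counter per client: each request for a decoy page increments it, and a client that goes deep enough is flagged. The class below is an illustration only. The threshold of four is borrowed from the "four links deep" remark rather than from any documented setting, counting decoy requests is a simplification of true link depth, and `LabyrinthTracker` and `client_id` are hypothetical names.

```python
from collections import defaultdict
from typing import DefaultDict

# Borrowed from the "four links deep" remark; the real cutoff is not stated.
DEPTH_THRESHOLD = 4


class LabyrinthTracker:
    """Count how many decoy pages each client has requested."""

    def __init__(self, threshold: int = DEPTH_THRESHOLD) -> None:
        self.threshold = threshold
        self.decoy_hits: DefaultDict[str, int] = defaultdict(int)

    def record_hit(self, client_id: str, is_decoy_page: bool) -> bool:
        """Record one request and return True once the client looks like a bot."""
        if is_decoy_page:
            self.decoy_hits[client_id] += 1
        return self.decoy_hits[client_id] >= self.threshold
```

A visitor who only loads real pages never accumulates hits, while a crawler that keeps following decoy links crosses the threshold on its fourth decoy request.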
Bots have evolved to detect traditional honeypot techniques. To stay ahead, Cloudflare aims for AI Labyrinth to “eventually create whole networks of linked URLs that are much more realistic, and not trivial for automated programs to spot.”
How to add AI Labyrinth
AI Labyrinth could be a useful tool to try for publishers or individuals who don’t want their work used to train AI (or misrepresented by chatbots in the process).
All Cloudflare customers, including those on the Free tier, can opt in to AI Labyrinth today. Simply go to your Cloudflare dashboard, navigate to the bot management section, and switch the AI Labyrinth toggle on.
Source : https://www.zdnet.com/article/ai-bots-scraping-your-data-this-free-tool-gives-those-pesky-crawlers-the-run-around/