Consent-to-Crawl
What is it?
Consent-to-Crawl is Cloudflare's framework for giving website owners the ability to set terms for how AI bots access their content. While the companion product AI Audit (AI Crawl Control) provides visibility and blocking, Consent-to-Crawl focuses on the consent and monetization layer — managed robots.txt, the Content Signals standard, and Pay Per Crawl. The idea: instead of just blocking AI crawlers or letting them take everything for free, website owners should be able to declare their preferences and get paid when AI companies access their content.
What problem does it solve?
The explosion of AI has created a fundamental conflict between content creators and AI companies:
- All-or-nothing access: Today, website owners can either allow AI crawlers or block them entirely via robots.txt. There's no middle ground — no way to say "you can use my content for search but not for training."
- robots.txt is voluntary: Even when website owners set robots.txt directives, compliance is voluntary. Many AI crawlers ignore them entirely, and website owners have no way to enforce their preferences at a technical level.
- No compensation model: AI companies build billion-dollar products on web content, but the creators of that content receive nothing. There's been no mechanism for content owners to charge for access.
- No industry standard: Every AI company crawls differently, identifies itself differently, and uses content for different purposes. There's been no shared vocabulary for expressing consent preferences.
How does it work?
Consent-to-Crawl has three layers:
- Managed robots.txt: When enabled, Cloudflare generates and maintains a robots.txt file for your domain that instructs known AI crawlers to stay away from your content. If you already have a robots.txt, Cloudflare merges its AI directives with yours. This is the "express your preferences" layer.
- Content Signals: An emerging standard (via contentsignals.org) that lets website owners declare granular preferences — whether they consent to AI training, AI-powered search, or AI input (chatbot answers) — directly in their robots.txt. This goes beyond simple allow/block to express how content may be used.
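To make the two layers concrete, here is a sketch of what a managed robots.txt carrying content signals might look like. The directive and signal names follow the contentsignals.org draft; treat the exact syntax shown here as illustrative rather than what Cloudflare emits verbatim:

```txt
# Example managed robots.txt combining AI-crawler directives with
# Content Signals (syntax illustrative, per the contentsignals.org draft).

# A known AI training crawler: keep out entirely.
User-agent: GPTBot
Disallow: /

# Everyone else: content may be fetched, with declared usage preferences --
# search = AI-powered search, ai-input = chatbot answers, ai-train = model training.
User-agent: *
Content-Signal: search=yes, ai-input=no, ai-train=no
Allow: /
```

The point of the `Content-Signal` line is exactly the middle ground the earlier section says is missing today: the same content can be open for search while declining training.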
- Pay Per Crawl (beta): The monetization layer. Website owners set a price per zone, and when an AI crawler requests content, it either presents payment intent via request headers (and gets HTTP 200 access) or receives an HTTP 402 Payment Required response with pricing information. Cloudflare acts as the merchant of record, handling payments between content owners and AI companies.
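The Pay Per Crawl handshake above can be sketched as a small decision function. This is a minimal illustration of the 200-vs-402 flow, not Cloudflare's implementation; the header names (`crawler-max-price`, `crawler-price`, `crawler-charged`) are assumptions introduced here for the sketch:

```python
# Sketch of the Pay Per Crawl decision for an incoming AI-crawler request.
# Header names are illustrative assumptions, not a documented Cloudflare API.

def handle_crawl(request_headers: dict, zone_price: float) -> tuple[int, dict]:
    """Return (HTTP status, response headers) for an AI crawler request
    against a zone priced at `zone_price` (USD per request)."""
    max_price = request_headers.get("crawler-max-price")
    if max_price is not None and float(max_price) >= zone_price:
        # Crawler declared payment intent at or above the zone price:
        # serve the content (HTTP 200) and record what was charged.
        return 200, {"crawler-charged": f"{zone_price:.2f} USD"}
    # No payment intent, or the offer is too low: refuse with
    # HTTP 402 and pricing info so the crawler can retry with an offer.
    return 402, {"crawler-price": f"{zone_price:.2f} USD"}
```

A crawler with no payment headers gets a 402 quoting the price; one offering at or above the zone price gets a 200. The design choice worth noting is that 402 is not a dead end — it is the price-discovery step of the negotiation.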
These layers work together with AI Crawl Control's enforcement features — if a crawler ignores your robots.txt preferences, you can block it via WAF rules or the AI Crawl Control dashboard.
Why it matters strategically
Consent-to-Crawl represents a new business model for the web and potentially a new revenue stream for Cloudflare. If Cloudflare becomes the standard platform where content owners set terms and AI companies pay for access, it creates a marketplace sitting on top of ~20% of all web traffic. This is the Act 4 revenue play: Cloudflare doesn't just protect the web or help people build on it — it facilitates the economic relationship between human-created content and AI systems that consume it. No other company is positioned to do this at Cloudflare's scale.