Media & Publishing
AI crawl control, content monetization via Pay Per Crawl, ad fraud prevention, server-side tag management, and content provenance.
The problem
AI crawlers extract publisher content at unprecedented scale without compensation. Traditional robots.txt relies on voluntary compliance, which many AI operators ignore. Four of the five largest media companies cite unauthorized AI scraping as a material business risk. Analysts estimate $2B/year in lost US publisher revenue from AI search.
How Cloudflare solves it
Cloudflare AI Crawl Control provides per-crawler allow/block controls with path-level granularity. AI Labyrinth traps misbehaving crawlers in generative honeypots. Web Bot Auth enables cryptographic verification of bot identity. Content Signals let publishers declare granular AI usage preferences (ai-train=no, ai-input=yes, search=yes).
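The preferences in parentheses above can be written as a single robots.txt directive. The following is a minimal sketch, assuming the "Content-Signal: key=value, ..." line syntax and the three signal names shown above; the helper names render_content_signal and parse_content_signal are illustrative and not part of any Cloudflare API.

```python
# Sketch: render and parse a Content-Signal line for robots.txt.
# Assumption: the "Content-Signal: key=value, ..." syntax; helper names are hypothetical.

ALLOWED_SIGNALS = ("search", "ai-input", "ai-train")


def render_content_signal(prefs: dict) -> str:
    """Build a Content-Signal line from {signal_name: allowed} preferences."""
    unknown = set(prefs) - set(ALLOWED_SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    # Emit signals in a fixed order so the output is stable.
    parts = [
        f"{name}={'yes' if prefs[name] else 'no'}"
        for name in ALLOWED_SIGNALS
        if name in prefs
    ]
    return "Content-Signal: " + ", ".join(parts)


def parse_content_signal(line: str) -> dict:
    """Parse a Content-Signal line back into {signal_name: allowed}."""
    field, _, value = line.partition(":")
    if field.strip().lower() != "content-signal":
        raise ValueError("not a Content-Signal line")
    prefs = {}
    for item in value.split(","):
        name, _, setting = item.strip().partition("=")
        if name not in ALLOWED_SIGNALS or setting not in ("yes", "no"):
            raise ValueError(f"bad entry: {item.strip()!r}")
        prefs[name] = setting == "yes"
    return prefs


if __name__ == "__main__":
    # Allow search and AI input (e.g. grounding), disallow AI training.
    print("User-agent: *")
    print(render_content_signal({"ai-train": False, "ai-input": True, "search": True}))
    print("Allow: /")
```

Like robots.txt itself, a line of this kind only states a preference; enforcement still depends on the allow/block controls described above.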
Products
AI Crawl Control, Bot Management, AI Labyrinth, Web Bot Auth, WAF, Content Signals, Cloudflare Radar
Customer KPIs
90%+ of unauthorized AI crawler requests blocked; Data transfer saved from blocked crawling (GB/month); robots.txt compliance rate across AI crawlers; Time from detection to enforcement (minutes); AI Labyrinth compute hours wasted by misbehaving bots
The problem
Publishers negotiate individual AI licensing agreements ($1-5M/year each), which require legal teams and manual enforcement and come with no infrastructure-level controls. There is no automated way to meter, price, and collect payment for AI crawler access at scale.
How Cloudflare solves it
Pay Per Crawl (private beta) creates an automated marketplace where publishers set per-zone prices. Cloudflare handles billing via the x402 payment protocol. Bot Management provides the intelligence layer, with bot scores enabling tiered access: free for search, paid for AI training, blocked for unauthorized scraping.
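The tiered access model above (free for search, paid for AI training, blocked otherwise) can be sketched as a small decision function. This is an illustration under stated assumptions, not the Pay Per Crawl implementation: the CrawlRequest fields, the purpose labels, and the decide function are hypothetical, and the only borrowed fact is that HTTP 402 is the standard "Payment Required" status code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CrawlRequest:
    """Hypothetical view of an incoming crawler request."""
    purpose: str                           # "search", "ai-train", or anything else
    verified: bool                         # identity confirmed, e.g. via Web Bot Auth
    max_price_usd: Optional[float] = None  # price the crawler is willing to pay


def decide(req: CrawlRequest, zone_price_usd: float) -> Tuple[int, str]:
    """Return (HTTP status, reason) for a request against a per-zone price."""
    if not req.verified:
        return 403, "blocked: unverified crawler"
    if req.purpose == "search":
        return 200, "free: search indexing"
    if req.purpose == "ai-train":
        # Serve only if the crawler has agreed to at least the zone price.
        if req.max_price_usd is not None and req.max_price_usd >= zone_price_usd:
            return 200, f"paid: charge {zone_price_usd:.2f} USD"
        return 402, f"payment required: {zone_price_usd:.2f} USD per request"
    return 403, "blocked: unrecognized purpose"


if __name__ == "__main__":
    price = 0.01  # hypothetical per-zone price set by the publisher
    for req in (
        CrawlRequest("search", verified=True),
        CrawlRequest("ai-train", verified=True),
        CrawlRequest("ai-train", verified=True, max_price_usd=0.05),
        CrawlRequest("ai-train", verified=False),
    ):
        print(req.purpose, req.verified, decide(req, price))
```

Checking verification first means an unverified crawler never learns the price, which keeps the paid tier restricted to crawlers with a confirmed identity.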
Products
Pay Per Crawl (Private Beta), Bot Management, AI Crawl Control, Web Bot Auth, x402 Protocol, WAF, Cloudflare Radar
Customer KPIs
New revenue from AI crawl access ($/month); AI crawlers onboarded to paid tier; Conversion rate from unauthorized scraping to paid licensing; Reduction in legal costs for AI licensing; Content path value intelligence (top crawled paths by revenue)
The problem
The average publisher page executes 20-40 third-party JS tags, degrading Core Web Vitals and conversion rates. Google Tag Manager (GTM) holds 99.5% market share, but its client-side model means every tag competes for browser resources. 200+ AI-generated 'slop sites' siphon programmatic budgets. 70%+ of marketers have had AI-related ad incidents.
How Cloudflare solves it
Cloudflare Zaraz moves all third-party script execution to the edge. It ships pre-built Managed Components for 40+ tools (GA4, Facebook Pixel, Google Ads, TikTok, etc.), DataLayer compatibility for drop-in GTM replacement, and a built-in CMP supporting GDPR, IAB TCF, and Google Consent Mode v2. Bot Management filters non-human traffic from analytics.
Products
Zaraz, Zaraz CMP, Zaraz Managed Components, Bot Management, Google Tag Gateway for Advertisers
Customer KPIs
1-3 second page load reduction; Core Web Vitals improvement (LCP, INP, CLS); 90%+ reduction in browser-executed third-party scripts; CMP vendor cost elimination; +7% conversion rate per 1s load improvement; Ad measurement accuracy via bot filtering
The problem
Ad blockers, Intelligent Tracking Prevention (ITP), and cookie deprecation cause 15-30% of conversion events to go unreported, degrading the campaign optimization signals that automated bidding relies on. Agencies managing multiple publisher clients face per-client CMP licensing costs and inconsistent measurement infrastructure.
How Cloudflare solves it
Zaraz sends conversion events server-side from the edge directly to the Google Ads Conversion API, Facebook Conversions API, TikTok, Pinterest, and Bing. Google Tag Gateway serves Google tags from the client's own domain at no cost, recovering blocked conversion signals. The HTTP Events API captures backend conversions without browser involvement.
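A backend conversion sent without browser involvement might look like the sketch below. It assumes the HTTP Events API accepts a JSON POST at a /zaraz/api path with an "events" array whose items carry a "client" object and a "__zarazTrack" event name; treat the path, the field names, and the build_conversion_request helper as assumptions to confirm against the current Zaraz documentation. The request is only built here, not sent.

```python
import json
import urllib.request


def build_conversion_request(site: str, event_name: str, properties: dict) -> urllib.request.Request:
    """Build (but do not send) a server-side conversion event request.

    Assumed payload shape: {"events": [{"client": {"__zarazTrack": ..., ...}}]}.
    """
    payload = {"events": [{"client": {"__zarazTrack": event_name, **properties}}]}
    return urllib.request.Request(
        url=f"https://{site}/zaraz/api",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # example.com and the event properties are placeholders.
    req = build_conversion_request(
        "example.com", "purchase", {"value": "200", "currency": "USD"}
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req, timeout=5)
```

Because the event originates on the server, it is unaffected by ad blockers or browser tracking prevention, which is the signal recovery mechanism described above.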
Products
Zaraz, Google Tag Gateway, Zaraz CMP, Zaraz HTTP Events API, Zaraz Managed Components, Bot Management
Customer KPIs
15-30% more conversions tracked (signal recovery); ROAS improvement from enhanced measurement; Page load improvement across client portfolio; CMP licensing savings across managed sites; Bot traffic filtered from campaign analytics
The problem
AI crawlers waste 60-80% of crawl budget re-fetching unchanged content while missing critical updates. Current crawl patterns rely on heuristic scheduling and sitemap polling lacking real-time freshness awareness. Publishers face delayed indexing (2-7 days avg), massive infrastructure costs from redundant requests, inability to communicate content priority, and no standardized way to signal "crawl now" vs "skip." LLMs train on stale data while fresh content sits undiscovered.
How Cloudflare solves it
Content Signals provide machine-readable freshness intelligence: Change Frequency Signals (declare update cadence), Timestamp Signals (precise last-modified metadata), Priority Signals (content hierarchy levels), Diff Signals (content fingerprinting), and Freshness TTL (time-sensitive content markers). Crawler efficiency benefits: 70% reduction in redundant requests, sub-minute discovery for priority updates, bandwidth savings (crawl only what changed), improved freshness scores in AI rankings, and predictable crawl scheduling reducing server spikes.
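How a crawler might combine the timestamp and diff signals above to decide between "crawl now" and "skip" can be sketched as follows. This is an assumption-laden illustration: the FreshnessSignal record, its field names, and the should_crawl function are hypothetical, and SHA-256 stands in for whatever fingerprinting scheme is actually used.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def fingerprint(body: bytes) -> str:
    """Content fingerprint; SHA-256 is an assumed choice."""
    return hashlib.sha256(body).hexdigest()


@dataclass(frozen=True)
class FreshnessSignal:
    """Hypothetical signal a publisher exposes for one URL."""
    fingerprint: str    # diff signal
    last_modified: int  # timestamp signal, Unix seconds


def should_crawl(signal: FreshnessSignal,
                 cached_fingerprint: Optional[str],
                 last_crawled: int) -> bool:
    """Return True if the crawler should re-fetch the page."""
    if cached_fingerprint is None:
        return True   # never crawled before
    if signal.last_modified <= last_crawled:
        return False  # nothing touched since the last crawl
    # Modified since last crawl: fetch only if the content really changed.
    return signal.fingerprint != cached_fingerprint


if __name__ == "__main__":
    old = fingerprint(b"<h1>Breaking news</h1>")
    new = fingerprint(b"<h1>Breaking news (updated)</h1>")
    print(should_crawl(FreshnessSignal(new, 2_000), old, last_crawled=1_000))
    print(should_crawl(FreshnessSignal(old, 2_000), old, last_crawled=1_000))
```

The second case covers a page that was re-saved without a content change: the timestamp moved, but the matching fingerprint lets the crawler skip the redundant fetch.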
Products
Content Signals, Crawler Hints (IndexNow), Cache API, Workers, R2, CDN
Customer KPIs
Crawl efficiency ratio (unique vs redundant requests); Time-to-index (publish to AI search availability); Crawl budget utilization; Freshness score in AI citations; Infrastructure cost per crawl; Signal adoption rate
The problem
AI-generated content is becoming indistinguishable from human-created content. The EU AI Act mandates content provenance. The NYT 10-K cites 'misattribution of incorrect information' as a risk. Publishers need infrastructure-level tools to prove origin, authenticity, and editorial chain of custody.
How Cloudflare solves it
C2PA Content Credentials preservation maintains provenance through image optimization. Cloudflare signs with its own C2PA certificate (Adobe trust chain), verifiable at contentcredentials.org. MCP content provenance uses Ed25519 signatures and Merkle tree proofs to attest that content was published by a specific domain at a specific time, with zero TTFB impact.
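The Merkle tree proof mentioned above can be illustrated with the standard library alone. This is a generic sketch, not Cloudflare's implementation: the leaf format (domain, timestamp, content hash joined with "|") is an assumption, SHA-256 is an assumed hash, and the Ed25519 step is omitted because it needs a third-party library; in a full scheme the publisher would sign the root.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling_is_left)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: List[bytes], index: int) -> Tuple[bytes, Proof]:
    """Return the tree root and an inclusion proof for leaves[index]."""
    if not leaves or not 0 <= index < len(leaves):
        raise ValueError("need at least one leaf and a valid index")
    # Distinct prefixes keep leaf hashes and interior hashes in separate domains.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    proof: Proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: Proof, root: bytes) -> bool:
    """Recompute the path from a leaf to the root and compare."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        node = _h(b"\x01" + sibling + node) if sibling_is_left else _h(b"\x01" + node + sibling)
    return node == root


if __name__ == "__main__":
    # Hypothetical attestations: domain | Unix timestamp | content hash.
    leaves = [
        f"example.com|{1700000000 + i}|{hashlib.sha256(str(i).encode()).hexdigest()}".encode()
        for i in range(5)
    ]
    root, proof = merkle_root_and_proof(leaves, 3)
    print(verify(leaves[3], proof, root))
    print(verify(b"example.com|0|forged", proof, root))
```

A verifier needs only the leaf, the short proof, and the signed root, so the check can run offline and does not sit in the page delivery path.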
Products
C2PA Content Credentials (Images + Image Transformations), MCP Content Provenance (Emerging), Workers, Cloudflare Images, R2
Customer KPIs
% of images with verified C2PA credentials; Provenance verification success rate; MCP provenance attestations/month; Reduction in misattribution incidents; EU AI Act compliance coverage; Publisher brand trust improvement