WAF & Bot Blocking: When Your Firewall Stops AI Crawlers

How Web Application Firewalls can accidentally block AI bots from reading your content.

What it measures

This check detects whether a Web Application Firewall (WAF), such as Cloudflare, Sucuri, AWS WAF, or Akamai, is blocking AI crawlers from accessing your content. WAFs protect against attacks, but their bot-detection rules can also block legitimate AI bots.

Why it matters for AI

When a WAF blocks an AI crawler, it typically serves a CAPTCHA challenge or a 403 Forbidden response. AI crawlers do not solve CAPTCHAs, so they never receive the page and your content is invisible to them. This is a common reason otherwise well-optimised content never appears in AI-generated answers.
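
As a rough illustration of the two outcomes described above, the sketch below labels a response as a hard block (403) or a challenge page. The function name `classify_response` and the marker phrases are assumptions chosen for this example, not a list published by any WAF vendor, and real challenge pages vary (some are served with a 200 status, which this sketch does not catch).

```python
# Phrases that often appear on challenge pages.
# Illustrative assumptions only; extend them for your own WAF.
CHALLENGE_MARKERS = ("captcha", "just a moment", "verify you are human")


def classify_response(status: int, body: str) -> str:
    """Label a response as 'challenged', 'blocked', or 'ok'."""
    text = body.lower()
    has_marker = any(marker in text for marker in CHALLENGE_MARKERS)
    # A challenge page is usually served with an error-like status code.
    if status in (403, 429, 503) and has_marker:
        return "challenged"
    # A plain 403 with no challenge text is treated as a hard block.
    if status == 403:
        return "blocked"
    return "ok"
```

Checking the status code before trusting the marker text avoids flagging a normal page that merely mentions the word "captcha".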

User agent	Used by	Recommended action
GPTBot	ChatGPT / OpenAI	Allow in WAF rules
ChatGPT-User	ChatGPT browsing	Allow in WAF rules
ClaudeBot	Claude / Anthropic	Allow in WAF rules
PerplexityBot	Perplexity AI	Allow in WAF rules
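Google-Extended	Google AI / Gemini	Allow in robots.txt (this is a robots.txt token, not a separate user agent; crawling uses Google's standard user agents)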

How to improve

Whitelist AI crawler user agents — Add an allow rule in your WAF for the bots listed above
Check your CDN settings — Cloudflare's "Bot Fight Mode" can block legitimate AI bots
Test with AI user agents — Use curl with a GPTBot user-agent string to test access
Monitor WAF logs — Look for blocked requests from legitimate AI crawlers
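
The user-agent test from the list above can also be scripted. The following is a minimal sketch using only the Python standard library: it requests the same URL with a browser-like user agent and with AI crawler user agents, then reports any difference in status code. The user-agent strings are simplified stand-ins that only contain the bot token (the real crawlers send longer strings), and `fetch_status` and `compare_access` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Simplified stand-in strings (assumption): real crawlers send longer values.
AI_USER_AGENTS = {
    "GPTBot": "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "ClaudeBot": "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "PerplexityBot": "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
}
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/120.0"


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code received for the given user agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; keep the code.
        return err.code


def compare_access(url: str, fetch=fetch_status) -> dict:
    """Compare each AI user agent's status code against a browser baseline."""
    baseline = fetch(url, BROWSER_UA)
    report = {}
    for name, user_agent in AI_USER_AGENTS.items():
        status = fetch(url, user_agent)
        if status == baseline:
            report[name] = "ok"
        else:
            report[name] = f"differs: {status} vs browser {baseline}"
    return report


# Example usage (performs real network requests):
# for bot, result in compare_access("https://example.com/").items():
#     print(bot, result)
```

A request with a copied user agent comes from your own IP address, so a WAF that verifies crawler IP ranges may treat it differently from the real crawler. Treat a clean result as a useful signal, and confirm it against the WAF logs.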

💡 Quick win

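In Cloudflare: go to Security → WAF → Custom Rules. Custom rules have no "Allow" action, so create a rule with the "Skip" action that matches user agents containing "GPTBot", "ClaudeBot", or "PerplexityBot". Bot Fight Mode cannot be skipped by a custom rule, so if it is enabled you may need to turn it off separately.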
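
If you use Cloudflare's expression editor instead of the visual rule builder, the rule above needs a match expression. The sketch below assembles one from a list of bot tokens using the `http.user_agent` field and the `contains` operator from Cloudflare's rules language. The helper name `build_user_agent_expression` is hypothetical, and the output should be reviewed in the dashboard before deploying.

```python
def build_user_agent_expression(tokens: list) -> str:
    """Join user-agent tokens into one 'contains ... or ...' expression."""
    if not tokens:
        raise ValueError("at least one user-agent token is required")
    clauses = [f'(http.user_agent contains "{token}")' for token in tokens]
    return " or ".join(clauses)


expression = build_user_agent_expression(["GPTBot", "ClaudeBot", "PerplexityBot"])
print(expression)
# (http.user_agent contains "GPTBot") or (http.user_agent contains "ClaudeBot") or (http.user_agent contains "PerplexityBot")
```

Matching on the user agent alone also lets through any client that copies one of these strings, so keep the rule as narrow as your setup allows.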