# Haydar Pasha Pizza - robots.txt
# https://haydarpashapizza.ca/

# ==========================================
# Default rules for all crawlers
# ==========================================
User-agent: *
Allow: /
Disallow: /config/
Disallow: /search/
Disallow: /account/
Disallow: /api/
Allow: /api/ui-extensions/
Disallow: /static/
Disallow: /*?*author=*
Disallow: /*?*tag=*
Disallow: /*?*month=*
Disallow: /*?*view=*
Disallow: /*?*format=*
Disallow: /llms.txt
# Keep llms.txt out of search indexes (non-standard but honoured by some crawlers).
# Production should also serve llms.txt with the HTTP header: X-Robots-Tag: noindex
Noindex: /llms.txt

# ==========================================
# AI / LLM crawlers
# ==========================================
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: OAI-SearchBot
User-agent: CCBot
User-agent: anthropic-ai
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: FacebookBot
User-agent: Meta-ExternalAgent
User-agent: cohere-ai
User-agent: PerplexityBot
User-agent: Applebot-Extended
User-agent: Bytespider
Allow: /
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html
Disallow: /llms.txt
Noindex: /llms.txt

# ==========================================
# Google Ads
# ==========================================
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
User-agent: AdsBot-Google-Mobile-Apps
Allow: /

# ==========================================
# Heavy / aggressive bots - throttle
# ==========================================
User-agent: Baiduspider
Crawl-delay: 10

User-agent: YandexBot
Crawl-delay: 10

# ==========================================
# Sitemap
# ==========================================
Sitemap: https://haydarpashapizza.ca/sitemap.xml