# AESTECHNO: robots policy
# Mirrors the legacy WordPress policy so AI training opt-outs survive cutover.

User-agent: *
Allow: /
Disallow: /api/
Disallow: /merci/
Disallow: /en/thanks/
# Legacy WP paths: the reverse proxy returns 410 Gone for these, but
# blocking here prevents future re-discovery from old backlinks (2026-05-16).
# NOTE: /wp-content/uploads/ is intentionally NOT disallowed. 39 markdown
# posts still reference image URLs under /wp-content/uploads///
# and the static dist serves them. We only block the plugin/theme/admin
# subpaths that actually moved off WP.
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-content/mu-plugins/
Disallow: /wp-includes/
Disallow: /wp-admin/
Disallow: /wp-json
Disallow: /wp-login
Disallow: /xmlrpc.php
Disallow: /category/
Disallow: /tag/
Disallow: /author/
Disallow: /auteur/
Disallow: /feed
Disallow: /comments/feed
# Tracking-parameter URLs: the canonical tag handles them, but blocking
# saves crawl budget. The * wildcard is defined in RFC 9309 and honored by
# major crawlers; parsers that predate it may ignore these rules.
# Each pattern matches only when the parameter is first in the query string.
Disallow: /*?wbraid=
Disallow: /*?gbraid=
Disallow: /*?gclid=
Disallow: /*?fbclid=
Disallow: /*?trk=
Disallow: /*?post_type=

# AI crawlers we ALLOW: ones that drive citations and backlinks
# (answer-engines that show source links, real-time browsers).
# Policy: visibility-positive crawlers in, training-only crawlers out.
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: PerplexityBot
Allow: /

# Google-Extended controls Google AI Overviews + Gemini answers.
# Both surface source citations and link back, so we allow it
# (2026-05-17 policy switch: the previous block cost ~1 GEO pt
# across all articles and removed us from Google AI Overviews).
User-agent: Google-Extended
Allow: /

# AI crawlers we DISALLOW: training-only, no citation back to source.
# CCBot feeds CommonCrawl which underlies many training datasets.
User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Amazonbot
Disallow: /

# anthropic-ai is Anthropic's training crawler; Claude's answer-engine
# citation traffic comes via ClaudeBot (allowed above).
User-agent: anthropic-ai
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: ImagesiftBot
Disallow: /

Sitemap: https://www.aestechno.com/sitemap-index.xml