触发条件
- Pages 有一个根域名(二级域名好像没这问题)的自定义域
- 部署的 Pages 里没有
404.html,有正常的 index.html
- 仪表板的 AI Crawl Control => Robots.txt => Cloudflare managed 开着
现象
- 手动访问
xxx.com/robots.txt 的时候 index.html 的文件内容会出现在 CF 的 robots.txt 模板下面,感觉像 Pages 默认回落的逻辑也跟着执行了。大概就像这样:
# As a condition of accessing this website, you agree to abide by the following
# content signals:
...
# BEGIN Cloudflare Managed content
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /
...
# END Cloudflare Managed Content
<!DOCTYPE html>
<html lang="zh">
...
</html>