Question 1

What does xrobotscheck do?

Accepted Answer

Three mechanisms can tell a crawler how to treat a page: robots.txt (controls crawling), the meta robots tag, and the X-Robots-Tag HTTP header (both control indexing). They frequently disagree. xrobotscheck fetches a URL and its robots.txt, reads all three for the crawler you pick (Googlebot or Bingbot), and shows a per-directive table of which one wins and why. It also names the classic traps.

Question 2

Why is my page noindex but still indexed?

Accepted Answer

The usual cause is the "noindex behind Disallow" trap: your page has a noindex, but robots.txt also blocks crawling of it. Because the crawler is not allowed to fetch the page, it never sees the noindex, so Google can still index the URL from external links (as a URL-only result). The fix is counter-intuitive: allow crawling so the noindex can be seen, and block the URL again only after it has dropped out of the index. xrobotscheck detects this exact case.
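The trap comes down to two facts being true at once: the page carries a noindex, and robots.txt forbids fetching it. Below is a minimal Python sketch of that test, not xrobotscheck's actual code. The function name `noindex_behind_disallow`, the sample robots.txt and the example.com URLs are illustrative assumptions. It uses the standard library's `urllib.robotparser`, whose rule matching is simpler than Google's and can differ in edge cases.

```python
from urllib.robotparser import RobotFileParser


def noindex_behind_disallow(robots_txt: str, url: str,
                            user_agent: str, has_noindex: bool) -> bool:
    """Return True when a noindex exists but the crawler may not fetch the page.

    `has_noindex` is assumed to be known already (for example from your own
    templates), because a blocked crawler cannot discover it by itself.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    crawlable = parser.can_fetch(user_agent, url)
    return has_noindex and not crawlable


# Hypothetical robots.txt used only for this example.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

blocked = noindex_behind_disallow(
    ROBOTS, "https://example.com/private/page.html", "Googlebot", True)
open_page = noindex_behind_disallow(
    ROBOTS, "https://example.com/public/page.html", "Googlebot", True)
```

Here `blocked` is true (the noindex can never be read) and `open_page` is false (the crawler can fetch the page and see the noindex).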

Question 3

Which directive does Google obey when they conflict?

Accepted Answer

robots.txt decides whether the page is crawled at all: if it is blocked, the meta robots tag and the X-Robots-Tag header are never seen. If the page is crawlable, Google combines the meta tag and the header and applies the most restrictive directive (so if either says noindex, the result is noindex). A directive aimed at a specific crawler (e.g. a googlebot meta tag, or "googlebot:" in the header) overrides the generic one for that crawler.
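The "most restrictive wins" rule can be sketched as a merge of the two directive lists. This is an illustrative Python sketch under simplifying assumptions, not Google's implementation: `combine_directives` is a hypothetical helper, it handles only the index/noindex and follow/nofollow pairs plus the `none` shorthand, and it ignores crawler-specific prefixes.

```python
def combine_directives(meta_content: str, header_value: str) -> set:
    """Merge meta robots and X-Robots-Tag values; restrictive tokens win."""
    tokens = set()
    for source in (meta_content, header_value):
        for token in source.split(","):
            token = token.strip().lower()
            if token == "none":
                # "none" is shorthand for "noindex, nofollow".
                tokens.update({"noindex", "nofollow"})
            elif token:
                tokens.add(token)
    # If both sides of a pair are present, drop the permissive one.
    for restrictive, permissive in (("noindex", "index"), ("nofollow", "follow")):
        if restrictive in tokens:
            tokens.discard(permissive)
    return tokens


# The meta tag allows indexing, the header forbids it: the result is noindex.
result = combine_directives("index, follow", "noindex")
```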

Question 4

Does robots.txt remove a page from Google?

Accepted Answer

No. robots.txt only controls crawling. A Disallowed URL can still appear in search results (without a description) if other pages link to it. To actually remove a page, it must be crawlable and carry a noindex.

Question 5

What is the difference between meta robots and X-Robots-Tag?

Accepted Answer

They carry the same indexing directives (noindex, nofollow, nosnippet, etc.). The meta robots tag lives in the HTML <head>, so it only works for HTML pages; the X-Robots-Tag is an HTTP response header, so it also works for PDFs, images and other non-HTML files. When both are present, the most restrictive wins.
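Because the header can carry an optional crawler name before the directives (for example `googlebot: noindex`), reading it takes a little more care than reading the meta tag. The following Python sketch shows one way to split a single header value. `parse_x_robots_tag` and `VALUE_DIRECTIVES` are hypothetical names, and the list of directives that take a value after a colon is an assumption that may be incomplete.

```python
from typing import List, Optional, Tuple

# Directives that themselves use "name: value" syntax, so a leading
# "name:" must not be mistaken for a crawler prefix (assumed list).
VALUE_DIRECTIVES = {"unavailable_after", "max-snippet",
                    "max-image-preview", "max-video-preview"}


def parse_x_robots_tag(value: str) -> Tuple[Optional[str], List[str]]:
    """Split one X-Robots-Tag value into (crawler or None, directives)."""
    head, sep, rest = value.partition(":")
    name = head.strip().lower()
    looks_like_bot = (bool(sep) and bool(name) and "," not in name
                      and " " not in name and name not in VALUE_DIRECTIVES)
    if looks_like_bot:
        bot, directives = name, rest
    else:
        bot, directives = None, value
    return bot, [d.strip().lower() for d in directives.split(",") if d.strip()]


specific = parse_x_robots_tag("googlebot: noindex, nofollow")
generic = parse_x_robots_tag("noindex")
```

With these inputs, `specific` is `("googlebot", ["noindex", "nofollow"])` and `generic` is `(None, ["noindex"])`.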

Question 6

Can I check Googlebot vs Bingbot separately?

Accepted Answer

Yes. Toggle the crawler, and the robots.txt group, the meta tag name (googlebot / bingbot), and the X-Robots-Tag bot prefix are all evaluated for the crawler you choose, because a site can give different instructions to each.
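As a rough illustration of per-crawler evaluation, the Python sketch below picks the meta tag content that applies to one crawler, using the precedence described in these answers (a crawler-specific tag replaces the generic `robots` tag). `directives_for` and the sample tags are hypothetical, and real crawlers may resolve such conflicts differently.

```python
from typing import List, Tuple


def directives_for(crawler: str, meta_tags: List[Tuple[str, str]]) -> List[str]:
    """Return the content values that apply to `crawler`.

    `meta_tags` is a list of (name, content) pairs taken from the page's
    <meta> tags. Tags named after the crawler take precedence over the
    generic "robots" tag (the assumption made in this sketch).
    """
    crawler = crawler.lower()
    specific = [content for name, content in meta_tags if name.lower() == crawler]
    generic = [content for name, content in meta_tags if name.lower() == "robots"]
    return specific or generic


# Hypothetical page that treats the two crawlers differently.
tags = [("robots", "noindex"), ("googlebot", "index")]
for_google = directives_for("googlebot", tags)
for_bing = directives_for("bingbot", tags)
```

Here `for_google` is `["index"]` while `for_bing` falls back to the generic `["noindex"]`, which is why the two crawlers need to be checked separately.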

Question 7

Is my data safe?

Accepted Answer

The check runs on Cloudflare and only fetches the public URL and its robots.txt; requests to private, loopback, link-local and cloud-metadata addresses are blocked, redirects are re-validated, and responses are size- and time-capped. We keep no logs of the URLs you check.
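The address blocking described above is a standard defence against server-side request forgery. A minimal Python sketch of the idea follows, using only the standard `ipaddress` module. It is an assumption about how such a filter can look, not a description of xrobotscheck's code, and `is_blocked_address` is a hypothetical name.

```python
import ipaddress


def is_blocked_address(ip: str) -> bool:
    """Return True for addresses a public URL checker should refuse to contact."""
    addr = ipaddress.ip_address(ip)
    # Link-local covers 169.254.0.0/16, which includes the common
    # cloud-metadata endpoint 169.254.169.254.
    return addr.is_private or addr.is_loopback or addr.is_link_local


loopback = is_blocked_address("127.0.0.1")
metadata = is_blocked_address("169.254.169.254")
public = is_blocked_address("8.8.8.8")
```

In practice the hostname has to be resolved first and the resulting address checked, and the same check has to be repeated for every redirect target, which is what re-validating redirects refers to.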
