(no title)
csiegert | 1 year ago
1. What does it look like for a page to be indexed when googlebot is not allowed to crawl it? What is shown in search results (since googlebot has not seen its content)?
2. The linked page says to avoid Disallow in robots.txt and to rely on the noindex tag. But how can I prevent googlebot from crawling all user profiles to avoid database hits, bandwidth, etc. without an entry in robots.txt? With noindex, googlebot must visit each user profile page to see that it is not supposed to be indexed.
seanwilson|1 year ago
> 1. What does it look like for a page to be indexed when googlebot is not allowed to crawl it? What is shown in search results (since googlebot has not seen its content)?
It'll usually list the URL with a description like "No information is available for this page". This can happen for example if the page has a lot of backlinks, it's blocked via robots.txt, and it's missing the noindex flag.
dazc|1 year ago
If user profiles are noindexed then why should you care if google are crawling, when almost every other crawler out there does not obey robots.txt?
It's not in google's interest to waste resources on non-indexable content, you are worrying far too much about it.