(no title)
cmroanirgo | 3 years ago
Here on HN we've been seeing posts about alternative search engines. How will those small bespoke engines make use of IndexNow unless the website pings each of them?
The way I see IndexNow, I'll still get crawled relentlessly by the bots I don't want crawling my site (because robots.txt never seems to apply to them unless there's a special listing explicitly for them).
So a website will still get crawled by every engine that isn't a participating search engine, which doesn't alleviate the problem.
A good compromise would be something like an RSS feed, which a site can publish and crawlers can hit to find changes. It would also allow easier management for those domains that have many moving parts: individual search engines can still be pinged, but each search engine just grabs the changes.xml file... Or something.
rstupek | 3 years ago
There already is such an "RSS" feed: it's called a sitemap. It's conventionally available at /sitemap.xml, or you can alternatively list its URL in the robots.txt file with a `Sitemap:` line.
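For what it's worth, here is a rough sketch of how a small crawler might use a sitemap as a change feed, assuming the site fills in the optional `<lastmod>` element. The function names and the sample URLs are made up for illustration, and treating undated entries as "possibly changed" is just one choice a crawler could make.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(text):
    """Parse a <lastmod> value (a date or a full timestamp) as an aware datetime."""
    # Older Python versions do not accept a trailing "Z", so normalise it first.
    dt = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # Assumption: values without an offset are treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def changed_since(sitemap_xml, since):
    """Return the URLs whose lastmod is newer than `since`, plus undated ones."""
    root = ET.fromstring(sitemap_xml)
    changed = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if loc is None:
            continue
        # No lastmod means the crawler cannot tell, so it re-checks the page.
        if lastmod is None or parse_lastmod(lastmod) > since:
            changed.append(loc.strip())
    return changed


# Hypothetical sitemap body, as it might be served from /sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2023-01-01</lastmod></url>
  <url><loc>https://example.com/new</loc><lastmod>2023-03-01T12:00:00Z</lastmod></url>
  <url><loc>https://example.com/undated</loc></url>
</urlset>"""

last_crawl = datetime(2023, 2, 1, tzinfo=timezone.utc)
print(changed_since(SAMPLE, last_crawl))
```

The crawler only has to fetch one file per visit and can skip every page whose `lastmod` predates its last crawl.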
firecall | 3 years ago
The lack of trust means a search engine needs to know if what it's being presented in metadata is actually what's being served to the browser!
That's why we can't have nice things! :-)
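To make the trust problem concrete, here is a minimal sketch of the kind of check a crawler might run: it compares what the site's metadata claims ("this page changed" or "this page didn't") against a hash of what was actually served. The function names are hypothetical, and a real engine would need something more forgiving than an exact byte hash, since pages often contain timestamps or ads that change on every request.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Fingerprint the served bytes so two fetches can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def claim_is_consistent(claimed_changed: bool, previous_hash: str, fetched_body: bytes) -> bool:
    """Check whether the site's 'changed' claim matches what was really served."""
    actually_changed = content_hash(fetched_body) != previous_hash
    return claimed_changed == actually_changed


# Hypothetical example: the metadata says "unchanged", but the served bytes differ.
old = content_hash(b"<html>old page</html>")
print(claim_is_consistent(False, old, b"<html>new page</html>"))
```

The catch is that running this check means fetching the page anyway, which is exactly the crawl the metadata was supposed to save.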