top | item 47131944

(no title)

peter_d_sherman | 6 days ago

Very interesting!

Yes, in this day and age, I could definitely see web pages being harder to crawl by search engines (and SEO companies and other users of automated web crawling technologies (AI agents?)) than they were in the early days of the Internet due to many possible causes -- many of which you've excellently described!

In other words, there's more to be aware of for anyone writing a search engine (or search-engine-like piece of software -- SEO, AI Agent, etc., etc.) than there was in the early days of the Internet, where everything was straight unencrypted http and most URLs were easily accessible without having to jump through additional hoops...

Which leads me to wonder... on the one hand, a website owner may not want bots and other automated software agents spidering their site (we have ROBOTS.TXT for this), but on the flip side, most business owners DO want publicity and easy accessibility for sales and marketing purposes, thus, they'd never want to issue a 403 (or other error code) for any public-facing product webpage...

Thus there may be a market for testing public facing business/product websites against faulty "I can't give you that web page for whatever reason" error codes from a wide variety of clients, from a wide variety of locations around the world.

That market is related to the market for testing if a website is up and functioning properly (the "uptime market"), again, from a wide variety of locations around the world, using a wide variety of browsers...

So, a very interesting post!

Also (for future historians!) compare all of the restrictive factors which may prevent access to a public-facing web page today Vs. Tim Berners-Lee original vision for the web, which was basically to let scientists (and other academic types!) SHARE their data PUBLICLY with one another!

(Things have changed... a bit! :-) )

Anyway, a very interesting post, and a very interesting article -- for both present and future Search Engine programmers!

discuss

order

No comments yet.