Seems like it's applicable to parts of the WWW, but certainly not all of it. Most pages are updated far less frequently.
Kadin | 17 years ago
I can only assume/hope that Google takes into account how frequently a page is modified when scheduling future crawls. E.g., if they crawl a page daily for a week and see only one update, they crawl it only every 3 days from then on. (You could probably use a much more aggressive backoff than this.)
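To make the backoff idea concrete, here is a rough sketch of how a crawler might pick its next interval from how often a page actually changed. This is only an illustration, not how Google does it: the function name `next_crawl_interval`, the 25%/75% thresholds, the 3x factor, and the 1-to-30-day clamp are all made-up assumptions.

```python
def next_crawl_interval(interval_days, crawls, updates,
                        backoff=3.0, min_days=1.0, max_days=30.0):
    """Pick the next crawl interval from the observed change rate.

    All parameters and thresholds are hypothetical, chosen only to
    illustrate the backoff idea.
    """
    if crawls <= 0:
        raise ValueError("need at least one crawl to estimate a change rate")

    # Fraction of recent crawls that found the page modified.
    change_rate = updates / crawls

    if change_rate <= 0.25:
        # Page rarely changes: back off and crawl less often.
        interval_days *= backoff
    elif change_rate >= 0.75:
        # Page changes on most visits: crawl more often.
        interval_days /= backoff
    # Otherwise keep the current interval.

    # Clamp so we never crawl more than daily or less than monthly.
    return max(min_days, min(max_days, interval_days))


# Crawled daily for a week, saw one update: move to every 3 days.
print(next_crawl_interval(1.0, crawls=7, updates=1))
```

A more aggressive backoff would just mean a larger `backoff` factor or a higher `max_days` cap.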
It would probably make sense for Google and Twitter to do something at a level lower than HTML if searching Twitter feeds is really that big a deal (and I'm not sure it is). It doesn't make sense to pull down and index an entire Twitter user page just to see whether the most recent post has changed; using the API would be more efficient. There are probably only a few web services this would be worth implementing for.
axod | 17 years ago
foppr | 17 years ago