pgalvin | 3 months ago

This is no longer true. They changed their policy to ignore robots.txt in 2017. I seem to recall that they still respected robots.txt later, though I can’t find any more information on it and may be misremembering. Currently, they do not.

8cvor6j844qw_d6|3 months ago

Does it mean archive.org works for any sites?

My main use for archive.is is for sites that somehow cannot be archived (a message shows up saying that this site cannot be archived, or something along those lines).

archive.is is generally pretty good at forcing an archive through: if the HTML capture doesn't work, the screenshot will work fine. It doesn't seem to handle GIFs or videos, though.

pseudalopex|3 months ago

> Does it mean archive.org works for any sites?

They respected exclusion requests after they stopped respecting robots.txt. I don't know their policy for new exclusion requests.

Animats|3 months ago

Oh. Did not know that.