Ask HN: Weird archive.today behavior?
140 points| rabinovich | 1 month ago
The relevant JS is:
setInterval(function() {
fetch("https://gyrovague.com/?s=" + Math.round(new Date().getTime() % 10000000), {
referrerPolicy: "no-referrer",
mode: "no-cors"
});
}, 300);
Looking at this blog, there seems to be exactly one article mentioning archive.today - "archive.today: On the trail of the mysterious guerrilla archivist of the Internet" (https://gyrovague.com/2023/08/05/archive-today-on-the-trail-...), where the person running the blog digs up some information about archive's owner.So perhaps this is some kind of revenge/DOS attack attempt/deliberately wasting their bandwidth in response to this article? Maybe an attempt to silence them and force to delete their article? But if it is, then I have so many questions. Like, why would the owner of the archive do that 2.5 years after the article was published? Or why would they even do that in the first place, do they not know about Streisand effect?
I'm confused.
mastermedo|1 month ago
> in a 2012 F-Secure forum post, a “masharabinovich” complains about “my website http://archive.is/” being blacklisted. They pop up on Wikipedia as well getting told off for adding too many links to archive.is, including a mention that they’re using the Czech ISP fiber.cz
KawaiiCyborg|1 month ago
Funnily enough, they removed that from their talk page right around the time this thread got posted, their first edit in almost 6 years: https://en.wikipedia.org/wiki/Special:Contributions/Masharab...
That's a lot of coincidences...
gghffguhvc|1 month ago
Reports of FBI going hard after archive.today around the time the HN account was setup and they post an archive.today competitor. Pings on the investigative article then a post to HN saying “3 days ago” which could indicate when FBI succeeded.
The only comment by the poster on this article is a sharp clarification of what doxxing is and isn’t.
Perhaps this is just an unusual way of slowly stepping out from behind the curtain on your own quirky terms after a fantastically long tenure.
dunder_cat|1 month ago
arcfour|1 month ago
mike_d|1 month ago
dunder_cat|1 month ago
fhub|1 month ago
jijijijij|1 month ago
This is what someone trying to start a treasure hunt like game would say....
Mom! Am I an NPC? Mom! Am I real???
eli|1 month ago
crazysim|1 month ago
https://gyrovague.com/2023/08/05/archive-today-on-the-trail-...
And one where the author's cool with whoever is running archive.today.
blorg|1 month ago
rafram|1 month ago
stavros|1 month ago
AndroTux|1 month ago
NedF|1 month ago
[deleted]
1vuio0pswjnm7|1 month ago
The author of the personal blog post claimed he works for Google, who has arguably the world's most complete web archive and uses it for commercial purposes
This archive used to be publicly accessible, at least in part, at webcache.googleusercontent.com^1
The blog post compares the size of archive.today with archive.org (about 1:40, according to the author)
But it does not include a comparison to cache.googleusercontent.com
1. Bing, another Google competitor, also offered part of their own archive at cc.bingj.com during that time
aendruk|1 month ago
I’m confused.
333c|1 month ago
I can't say for sure whether this is what happened here, but it is a possible explanation.
gyrovague|1 month ago
https://gyrovague.com/2023/08/05/archive-today-on-the-trail-...
In the past week or so, I have received a GDPR takedown attempt of the archive.today blog post (which my hosting provider rightly rejected), a politely worded request to take it down (which was sadly eaten by my spam filter), and now this (thanks to the HN reader who tipped me off).
Given that the proverbial cat has been out of the bag for 2.5 years at this point, I'm genuinely puzzled as to what they're hoping to achieve, but this does not seem like a very good way of going about it.
opengrass|1 month ago
g-b-r|1 month ago
Do you know when it began?
And what do you think of the account reporting this being named rabinovich, and having being created months ago?
notmysql_|1 month ago
sbdaman|1 month ago
mediumdeviation|1 month ago
ideasphere|1 month ago
“Behind the complaints: Our investigation into the suspicious pressure on Archive.today”
internetter|1 month ago
jijijijij|1 month ago
Brybry|1 month ago
For example, there was some NASA debris that hit a guy's house in Florida and it was in the news. [1] Some news sites linked to a Twitter post he made with the images but he later deleted the post. [2]
The Wayback Machine has a ton of snapshots of the Twitter post but none of them render for me. [3]
But archive.today's snapshot works great. [4]
[1] https://www.bbc.com/news/articles/c9www02e49zo
[2] https://xcancel.com/Alejandro0tero/status/176872903149342722...
[3] https://web.archive.org/web/20240715000000*/https://twitter....
[4] https://archive.md/obuWr
ycombinator_acc|1 month ago
mediumdeviation|1 month ago
That said I don't think there's many non-malicious explanation for this, I would suggest writing to HN and see about blocking submissions from the domain hn@ycombinator.com
heraldgeezer|1 month ago
[deleted]
s13k|1 month ago
https://archive.is/https://gyrovague.com/2023/08/05/archive-...
nativeit|1 month ago
russian_archive|1 month ago
One has to wonder why all this tracking from administrator(s) that want to stay anonymous?
You can't trust anything hosted on archive.today because you can't trust that the content hasn't been altered in some way in the pursuit of their agenda.
ventegus|1 month ago
self_awareness|1 month ago
Barbing|1 month ago
aendruk|1 month ago
ventegus|1 month ago
Save the page now and compare a week later.
g-b-r|1 month ago
ventegus|1 month ago