

nikisweeting | 3 months ago

My personal theory is that archive.is has paid subscription accounts (legit or via botnet) at most of the major news outlets and edits the HTML to make the pages look like they aren't logged in. I wonder if they do it by hand or with something like: https://github.com/pirate/html-private-set-intersection
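A minimal sketch of the general "intersection" idea, assuming two captures of the same page taken from different accounts: keep only the text that appears in both, so anything account-specific drops out. This is not the linked project's actual algorithm, and `TextCollector`, `text_chunks`, and `shared_text` are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect non-empty text chunks in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def text_chunks(html):
    """Return the stripped text chunks found in an HTML string."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


def shared_text(capture_a, capture_b):
    """Keep only chunks present in both captures, in capture_a's order."""
    in_b = set(text_chunks(capture_b))
    return [chunk for chunk in text_chunks(capture_a) if chunk in in_b]


# Hypothetical captures of one article from two different accounts.
page_a = "<div><span>Hi, alice</span><p>Paid article body.</p></div>"
page_b = "<div><span>Hi, bob</span><p>Paid article body.</p></div>"
print(shared_text(page_a, page_b))
```

A real tool would have to work on the DOM structure rather than flat text, since this sketch throws away all markup and would also drop any chunk that merely differs by a timestamp or ad slot.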


nikcub|3 months ago

In my experience it's just a headless browser with a bypass-paywalls extension.

Stagnant|3 months ago

It is definitely more than that for some sites, and it has to be manually managed. For example, this year I've seen archive.is capture paid articles from some Finnish newspapers, and the layout gives away that it was logged in to an account, although the identifying details have been stripped out.

There have been periods of weeks or months when they didn't have paid access to those Finnish sites. I tried it just now on a paid hs.fi article from today and it didn't work, but paid articles from just a week ago seem to have been captured as a premium user.

It is curious how they have time to do it, and I wonder if news sites in other smaller languages get similar treatment.