Getting around website paywalls with devtools alone

[+] samwillis|3 years ago|reply

Many sites don't contain the full content even if you do that.

I'm not sure how it works (does it subscribe to them all?), but https://archive.ph/ is a good way to see the content in those cases.

But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there.

[+] DennisP|3 years ago|reply

I subscribe to a major newspaper. But I'm not going to subscribe to all major newspapers. The individual subscription model doesn't really fit a world where you can go to one site, and click links to articles from lots of different publications.

If they had a common subscription, where you pay one reasonable fee and they divide it up according to whose articles you read, I'd subscribe to that. Since they don't, I subscribe to one paper and do workarounds on the others. I feel this is ethical because if everyone did it, with a decently random distribution, then the newspapers would survive just fine. They'd make the same overall revenue as when everyone had one newspaper, showing up at their doorstep each morning.

[+] enumjorge|3 years ago|reply

I’d be more willing to subscribe if publications didn’t pull the dark pattern of making signing up fast and easy, but then requiring a call to a rep to cancel.

I used to subscribe to the nytimes but a few years ago I needed a break from news. My plan was to come back in 6-12 months, but they made me wait on the phone for 25 mins for something that should have taken a couple of minutes on their site. I cancelled and never went back.

[+] felixthehat|3 years ago|reply

https://12ft.io is another good one, but yes, agreed: you should cough up for the subscription when possible.

[+] terrycody|3 years ago|reply

+1 for this.

This website almost always succeeds when I run out of my own tricks, like:

1) Press ESC to interrupt the page load.
2) Quickly hit "view mode" before the wall appears.
3) Add a "." after the .com, like .com./
4) Visit in an incognito window when the tokens run out (e.g. Medium).
5) Check the Google cache of the page (you can quickly add cache: before the URL to visit the cached page).
6) Check the archive.org cache for lost pages.
7) Maybe some extensions, but I seldom use them nowadays.
8) There used to be some cool sites that could remove the paywall (sorry, I forgot the names), but they have all stopped working.
9) Console tricks, though I dunno.
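
A minimal sketch of tricks 5 and 6 as a helper that builds the lookup strings for a given page. The function name `alternateUrls` is hypothetical, and the URL patterns are assumptions about how the Wayback Machine and the `cache:` search operator were addressed at the time; either may have stopped working since.

```javascript
// Build alternate lookup strings for a page (hypothetical helper).
// The patterns below are assumptions and may stop working at any time.
function alternateUrls(pageUrl) {
  const url = new URL(pageUrl); // throws on malformed input
  return {
    // Wayback Machine: this form redirects to a stored capture, if one exists.
    wayback: "https://web.archive.org/web/" + url.href,
    // The "cache:" operator from trick 5, as a search query string.
    cacheQuery: "cache:" + url.href,
  };
}

const alt = alternateUrls("https://example.com/news/story");
console.log(alt.wayback);
console.log(alt.cacheQuery);
```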

[+] phemartin|3 years ago|reply

Is this even legal? How can sites host others' content, and that's OK?

- Is it fair use because it's "archiving" the web?

- Is it because it's on the open web and it's public domain?

- Or is it illegal, and people do it because they can ¯\_(ツ)_/¯

[+] vie00001|3 years ago|reply

> I'm not sure how it works (does it subscribe to them all?), but https://archive.ph/ is a good way to see the content in those cases.

I think for search engine crawlers there are versions without a paywall so these articles can get fully indexed. Archive.ph, and similar services, might get the full content this way somehow. But I am just guessing.

[+] unknown|3 years ago|reply

[deleted]

[+] grishka|3 years ago|reply

I stumbled upon a link to an article with an interesting headline. I would like to read it, so I click the link, but there's a paywall. I have no idea what site that even is. This is the first and probably last time I'm seeing it. No way it gets a single cent from me. This just can't work when there are so many news websites competing for subscriptions.

Yes, archive.ph works most of the time, can't recommend it enough.

[+] izietto|3 years ago|reply

> But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there

I'd go further than your statement: I try not to read paywalled content. Actually, I don't get all these paywall workarounds. I'm like "they don't want me to read it? I'm not going to read it then".

[+] dkdbejwi383|3 years ago|reply

> But really, if you are regularly reading content on a site you should subscribe to support the journalists employed there.

While there are some paywalled websites that allow you to read _n_ articles per period for "free", there are many that don't. How do I know in this case whether it's worth the cost?

There are also times where I'll see a link to something behind a paywall with an interesting headline (frequently on HN), but from a publication I don't regularly read, so have no intention of a subscription. It would be nice in this case to be able to pay a one-time, small contribution.

Worth stating I don't disagree necessarily with the sentiment, there are just a few "edge cases" that make it impractical.

[+] jelangkung|3 years ago|reply

this.

bloomberg.com, for instance, hides paywalled lines in empty <div>s.

the other method is to disable JavaScript and cookies (works on nytimes.com), or press the ESC key to stop the page loading before the paywall kicks in (works on telegraph.co.uk) :)

[+] start123|3 years ago|reply

I just use Firefox's reader view. Does the same with just a click. If it doesn't work, just refresh in reader view and it should load properly.

[+] bebrws|3 years ago|reply

Added this comment to the post if that is ok. Let me know here if it isn't. Thank you!

[+] drewtato|3 years ago|reply

Usually on these sites, there'll be an `overflow: hidden` element that's holding all the content. If you can find and disable that CSS line, it'll work as normal. Or just save it to the Wayback Machine and read it through that.
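
One way to find that element from the devtools console, as a rough sketch rather than a guaranteed recipe: list the top-level elements whose computed overflow is hidden. The helper name `findOverflowHidden` and the selector `html, body, body > *` are assumptions; some sites put the lock deeper in the tree.

```javascript
// Return the elements whose computed overflow is "hidden".
// `getStyle` is passed in so the function also runs outside a browser;
// in devtools you would pass window.getComputedStyle.
function findOverflowHidden(elements, getStyle) {
  return Array.from(elements).filter((el) => {
    const style = getStyle(el);
    return style.overflow === "hidden" || style.overflowY === "hidden";
  });
}

// Browser only: log candidates among the root elements and body's children.
if (typeof document !== "undefined") {
  const hits = findOverflowHidden(
    document.querySelectorAll("html, body, body > *"),
    (el) => window.getComputedStyle(el)
  );
  console.log(hits);
}
```

Each logged element can then be inspected in the Elements panel and the rule unticked there.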

[+] bebrws|3 years ago|reply

I am going to add this to the post if that is alright. Please let me know if I shouldn't cite your HN username on the post. You can reply here and I'll see it. Thank you, this is super helpful.

[+] prettyStandard|3 years ago|reply

Not sure if this is exactly right, but something like this should obviate the need to find the element holding all the content.

* { overflow: visible !important; }
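
A sketch of applying that rule from the console instead of editing styles by hand. The helper names `overflowOverrideCss` and `injectCss` are made up for this example.

```javascript
// Build the override rule from the comment above as a string.
function overflowOverrideCss(selector = "*") {
  return selector + " { overflow: visible !important; }";
}

// Append the rule to the page as a <style> element.
// `doc` only needs createElement and head.appendChild, as a DOM document has.
function injectCss(doc, cssText) {
  const style = doc.createElement("style");
  style.textContent = cssText;
  doc.head.appendChild(style);
  return style; // keep a handle so the rule can be removed again
}

// Browser only: apply the rule to the current page.
if (typeof document !== "undefined") {
  injectCss(document, overflowOverrideCss());
}
```

The `*` selector also overrides legitimate scroll containers such as code boxes and carousels. Passing a narrower selector, for example `overflowOverrideCss("html, body")`, is less likely to break the rest of the layout.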

[+] batperson|3 years ago|reply

In the case of Washington Post it just has a "position: fixed" style on the <body> element. That's usually the case with most of these scroll locking sites, one of the root parent elements will have some CSS style that you can click off.
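
The same click-off can be done from the console. This is a sketch under the assumption that the lock sits on `<html>` or `<body>`; the helper name `unlockScroll` is hypothetical, and a site that re-applies its styles from a script will undo it.

```javascript
// Override scroll-locking properties on one element via inline styles.
// `el` only needs a style.setProperty method, as DOM elements have.
function unlockScroll(el) {
  el.style.setProperty("position", "static", "important");
  el.style.setProperty("overflow", "auto", "important");
}

// Browser only: apply to the two root elements.
if (typeof document !== "undefined") {
  unlockScroll(document.documentElement);
  unlockScroll(document.body);
}
```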

[+] jwr|3 years ago|reply

What I find annoying about paywalled sites is that they provide the full content to Google. And Google is OK with indexing the full content, even though it is not available on the internet, and even though they explicitly forbid the practice of showing different content to a search engine from what is available publicly.

Paywalled sites are just fine, but they are not part of the open Internet, and should not pretend to be.

[+] CM30|3 years ago|reply

Yeah, 100% agree with this. It's like these sites want to have their cake and eat it, and both get the traffic the 'open web' provides while not having to actually share any of their work there.

It's like if you needed an app to view a page, yet Google had all its content indexed. Why is that (rightly) seen as unreasonable while charging users for content you provide to bots for free isn't?

[+] iicc|3 years ago|reply

Sometimes the full content is available - but only if you navigate to it via a Google results page, and you don't have existing cookies implementing a "free article limit".

[+] donohoe|3 years ago|reply

> they explicitly forbid the practice of showing different content to a search engine from what is available publicly

This isn’t true. This paywall treatment is something they do allow and have worked to accommodate.
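
For context, the accommodation takes the form of structured data: a page can mark itself as paywalled with the schema.org `isAccessibleForFree` property in its JSON-LD. A rough sketch of checking for that flag follows. The helper name `declaresPaywall` and the sample object are made up for illustration, and real pages may nest the data differently.

```javascript
// Report whether a parsed JSON-LD object declares paywalled content.
// Accepts the flag as a boolean or as a string such as "False".
function declaresPaywall(jsonLd) {
  const isFalse = (v) => String(v).toLowerCase() === "false";
  if (isFalse(jsonLd.isAccessibleForFree)) return true;
  // hasPart may be a single object or an array of objects.
  const parts = [].concat(jsonLd.hasPart || []);
  return parts.some((part) => isFalse(part.isAccessibleForFree));
}

// Illustrative sample, not taken from any real site.
const sample = {
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  isAccessibleForFree: false,
  hasPart: {
    "@type": "WebPageElement",
    isAccessibleForFree: false,
    cssSelector: ".paywall",
  },
};
console.log(declaresPaywall(sample));
```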

[+] mrobins|3 years ago|reply

That’s a Google problem not a publisher problem. Content should be findable and available to people who want to pay for it.

Google could easily add an option to filter out paywalled content but that would reduce clicks.

[+] _boffin_|3 years ago|reply

or... just remove the `overflow: hidden` that's most likely placed on the `<body>` or some high level `<div>`.

[+] lol768|3 years ago|reply

This is the correct answer - and then scrolling will work properly!

Not sure why this 'hack' is on the front-page.

[+] courgette|3 years ago|reply

My experience is that it stopped working a few years ago, e.g. on NYT or FT. The content is nowhere to be found in the raw HTML itself.

No idea how it works but it looks like actual content is loaded separately once the gates are open?
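
A quick way to test that guess is to count the words in the raw HTML the server sends. This is only a sketch: `roughWordCount` is a hypothetical helper, and regex tag stripping is crude, so treat the number as an estimate.

```javascript
// Rough word count of the visible text in an HTML string.
function roughWordCount(html) {
  const text = html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, " ") // drop script/style bodies
    .replace(/<[^>]+>/g, " "); // drop the remaining tags
  return text.split(/\s+/).filter(Boolean).length;
}

// Browser only: re-fetch the current page and count words in the raw response.
if (typeof location !== "undefined" && typeof fetch !== "undefined") {
  fetch(location.href)
    .then((res) => res.text())
    .then((html) => console.log(roughWordCount(html)));
}
```

A count far below the length of a full article suggests that only the headline and lead text were sent, in which case no CSS trick will reveal the rest.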

[+] retox|3 years ago|reply

A handful of sites will present the subscriber's view of the page if you put a dot after the TLD part of the URL, i.e.

https://site.com./1235/article

Those behind Cloudflare don't seem to be vulnerable to this though.

I've emailed the sites I've found where this works and none of them have fixed it after a year.
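
For anyone who wants to try it, a small sketch that adds the dot without hand-editing the address. The helper name `withTrailingDot` is made up. The trailing dot is the fully qualified form of the same host name; why some servers treat it differently is not established in this thread.

```javascript
// Return the same URL with a trailing dot appended to the host name.
function withTrailingDot(pageUrl) {
  const url = new URL(pageUrl); // throws on malformed input
  if (!url.hostname.endsWith(".")) {
    url.hostname = url.hostname + ".";
  }
  return url.href;
}

console.log(withTrailingDot("https://site.com/1235/article"));
```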

[+] neoromantique|3 years ago|reply

>I've emailed the sites I've found where this works

Why?

[+] creamyhorror|3 years ago|reply

Why would this work? I know FQDNs sometimes require specifying a period at the end, and browsers seem to accept it as well, but I wonder why this would affect some frameworks' displayed content. Normally a site would rely on a cookie to maintain logged-in status, and that shouldn't be affected by the request URL.

[+] bilekas|3 years ago|reply

> I've emailed the sites I've found where this works and none of them have fixed it after a year.

Guess it's a feature then and not a bug.

[+] mock-possum|3 years ago|reply

Maybe some kind of common mistake when writing a pattern match

[+] _the_inflator|3 years ago|reply

A Clickbait Transformator would have opted for a headline along the lines: "This tip saves you thousands of Dollars!". ;)

[+] karrotwaltz|3 years ago|reply

I use this JS bookmarklet to remove fixed elements and restore scrolling, it works most of the time:

https://pastebin.com/qBjJHkMv

I also have one to kill all running javascript and remove all event listeners, it works wonders when you are redirected to a paywall / login page after a few seconds.
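
For readers who can't open the link, here is a rough sketch of what a bookmarklet of the first kind could look like. This is not the linked script, just an assumption about the general approach, and `removePinned` is a made-up name.

```javascript
// Remove elements pinned with position "fixed" or "sticky".
// Returns how many elements were removed.
function removePinned(elements, getStyle) {
  let removed = 0;
  for (const el of Array.from(elements)) {
    const position = getStyle(el).position;
    if (position === "fixed" || position === "sticky") {
      el.remove();
      removed += 1;
    }
  }
  return removed;
}

// Browser only: strip pinned elements, then restore scrolling on the roots.
if (typeof document !== "undefined") {
  removePinned(document.querySelectorAll("body *"), (el) =>
    window.getComputedStyle(el)
  );
  document.documentElement.style.overflow = "auto";
  document.body.style.overflow = "auto";
}
```

To use it as a bookmarklet, wrap the code in `javascript:(() => { ... })();` and save that as the URL of a bookmark. Note that it also removes legitimate sticky headers and navigation bars.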

[+] albert_e|3 years ago|reply

Would you mind sharing the second script as well? Thanks

This is supposed to be saved as a Javascript Bookmarklet?

[+] arbol|3 years ago|reply

Good tip. Sites turning off scroll is one of my pet annoyances.

[+] PufPufPuf|3 years ago|reply

I recommend the browser extension "Bypass Paywalls Clean". I sometimes think about the morality of using it, but I just don't find it viable to pay all the websites where I read just a single article.

[+] denton-scratch|3 years ago|reply

> but I just don't find it viable to pay all the websites where I read just a single article.

This.

In the print days, you'd buy a newspaper; you'd have access to all the articles in that edition. I used to read a daily paper.

In the modern world, these papers expect you to pay for a newspaper just to read a single article. I dunno, perhaps they could form a "Paywall Consortium", so that I could pay a one-day fee to the consortium, and have access to Washpo, Telegraph, NYT etc. for 24 hours. Let the consortium figure out how to distribute the fees - it's not my concern.

But if you want me to buy the whole paper to read a single article, well, ain't gonna happen.

[+] kokanee|3 years ago|reply

If you're unable to scroll the page, it generally means there is a "position: fixed" css rule on the body or a wrapper element. Turn off that css rule and you can scroll through the article normally.

Publishers are slowly wising up, though. Most don't load the full article for unpaid viewers anymore.

[+] porbelm|3 years ago|reply

Yeah on the system most of the local papers use over here, the full content is not even loaded for non-subscribers. It used to be, so removing the blur div worked, but now only the headline, byline and lead text are visible :(

Guess they caught on to the "cheaters."

[+] Am4TIfIsER0ppos|3 years ago|reply

Just like with pointless loading spinners. You just delete the element which is overlaid on the complete content. Sometimes there is an overflow, opacity, or visibility property that needs changing. Fucking webdevs!

[+] bubblebaker|3 years ago|reply

My method is to use the uBlock Origin extension to block third-party scripts.

[+] phtrivier|3 years ago|reply

Title should be: "getting around very poorly implemented paywalls (e.g. WaPo) with devtools alone".

As soon as your site sends the whole content of the article to the browser, you're not even trying seriously. (And Firefox "reading mode" is just much better UX than the devtools.)

[+] 71a54xd|3 years ago|reply

Your site's feature of "cool comments from HN" is awesome! Automating this with some kind of LLM and selling it would be killer.

[+] vixen99|3 years ago|reply

Does not work with https://www.spectator.co.uk/

[+] l3mure|3 years ago|reply

Another devtools trick I've used is network throttling to allow me to copy out an article's content before JS loads.

120 comments