top | item 37117338

Bypassing YouTube video download throttling

587 points | 0x7d0 | 2 years ago | blog.0x7d0.dev

227 comments

[+] thrdbndndn|2 years ago|reply
> To bypass this limitation, we can break the download into several smaller parts using the HTTP Range header. This header allows you to specify which part of the file you want to download with each request (eg: Range bytes=2000-3000). The following code implements this logic.

Last time I read discussion about it in yt-dlp repo [1], you can actually bypass it by just adding range=xxx query parameter (not header), and it will return to full speed even if your range is just the whole thing.

And IIRC YouTube has already lifted this restriction.

Edit: found the ref [1] https://github.com/yt-dlp/yt-dlp/issues/6400
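The Range approach quoted from the article can be sketched in a few lines (a minimal Python sketch, assuming you already hold a direct media URL and know its total size; the 10 MiB chunk size is an arbitrary choice):

```python
# Minimal sketch of chunked downloading with the HTTP Range header,
# assuming a direct media URL and its total size are already known.
import urllib.request

CHUNK = 10 * 1024 * 1024  # 10 MiB per request; size is arbitrary

def byte_ranges(total_size: int, chunk: int = CHUNK):
    """Yield inclusive (start, end) byte ranges covering [0, total_size)."""
    for start in range(0, total_size, chunk):
        yield start, min(start + chunk - 1, total_size - 1)

def download_in_ranges(url: str, total_size: int) -> bytes:
    parts = []
    for start, end in byte_ranges(total_size):
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:  # expect 206 Partial Content
            parts.append(resp.read())
    return b"".join(parts)
```

Servers that honor Range reply with 206 Partial Content per chunk; whether this actually dodges the throttle (versus the `range=` query parameter) is exactly what this subthread is debating.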

[+] probably_wrong|2 years ago|reply
I've never tried YouTube, but I have downloaded videos from sketchier streaming websites using the web developer tools.

Almost all of them have the same protection: some code that triggers only when you open the tools and stops the video by creating a debugger statement you cannot skip, while also triggering some CPU-heavy code (probably an infinite loop, although I wouldn't rule out cryptominers). More importantly, this code also clears the network request information, making it more difficult to analyze the traffic sent so far. Note to Firefox devs: enabling "persist logs" should persist the logs. Don't clear them!

None of this is perfect and I never found a video I couldn't eventually download (timing attacks ftw), but I do wish I could find a deeper explanation on how this all works.

[+] no_time|2 years ago|reply
>Almost all of them have the same protection: some code that triggers only when you open the tools and stops the video by creating a debugger statement you cannot skip

If you missed it, not so long ago there was a submission that evaded exactly this. The solution is simple yet effective: recompiling the browser with the debugger keyword renamed. Made me smile.

https://news.ycombinator.com/item?id=36961445

[+] dicytea|2 years ago|reply
> some code that triggers only when you open the tools

I've seen this technique too, but I feel that this is a major flaw on the browser's side. It should be impossible to tell if the dev tools are open or not. Surely this can be done, right?

[+] lazylion2|2 years ago|reply
One trick that sometimes works for me on Firefox: shift+right-click the video -> This Frame -> Open Frame in New Tab, and the dev tools work there.
[+] jacobwilliamroy|2 years ago|reply
Have you tried using wireshark to analyze the traffic? That's the first idea I had since you said these pages are trying to detect your browser's developer tools.
[+] stavros|2 years ago|reply
You can use something like mitmproxy or HTTP Toolkit to examine things exactly, as those can't be interacted with or detected like the dev tools can.
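For anyone curious what that looks like in practice, here's a hedged sketch of a mitmproxy addon (mitmproxy loads module-level hook functions from a script passed with `-s`; treating `googlevideo.com` as the media host is an assumption based on how YouTube commonly serves video):

```python
# Sketch of a mitmproxy addon (run with: mitmproxy -s log_media.py).
# mitmproxy calls module-level hooks like response() once per HTTP flow;
# nothing runs inside the page, so in-page devtools detection never fires.
def response(flow):
    # Log any media request to the (assumed) googlevideo.com host.
    if "googlevideo.com" in flow.request.pretty_host:
        size = len(flow.response.content or b"")
        print(f"{flow.request.pretty_url} -> {size} bytes")
```

Unlike the devtools network tab, the proxy's log can't be cleared by page JavaScript.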
[+] userbinator|2 years ago|reply
I just use the logs of my filtering proxy.
[+] PeterStuer|2 years ago|reply
In the endgame you will have to pay in WorldCoin to keep watching your screen, and you can only earn that untradable WorldCoin by viewing the ads, monitored through the mini Orb embedded in the screen.

You didn't truly believe this was about UBI did you? We solved that one ages ago with bank accounts and KYC.

[+] wdb|2 years ago|reply
Nothing with the eye?
[+] bayesianbot|2 years ago|reply
Anyone else having throttling problems with yt-dlp lately? I always watch YouTube through mpv, which uses yt-dlp in the background, but last week it's been terrible. It starts quickly (I've throttled it to 500 kB/s, so that's the starting speed), but then after a while I'm getting a second of stream for three seconds of download, so I have to queue up the video a long time before playing. I'm using the git version of yt-dlp and haven't noticed anything related in the git issues.
[+] jorams|2 years ago|reply
mpv only uses yt-dlp to get a video URL, then passes that URL to ffmpeg. ffmpeg doesn't implement the workarounds with range headers, so you get throttled. It's possible to have yt-dlp perform the download, pipe it to mpv and make mpv play from stdin, but it breaks seeking to parts that haven't been downloaded yet. There are many issues about this in the mpv issue tracker.
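A sketch of that yt-dlp-downloads/mpv-plays pipe (assumes both tools are on PATH; `yt-dlp -o -` streams the download to stdout and `mpv -` plays from stdin, with the seeking caveat described above):

```python
# Sketch of the stdin-pipe workaround: yt-dlp performs the download
# (with its throttling workarounds) and mpv plays from stdin.
# Seeking past what has been downloaded so far does not work.
import subprocess

def play_via_pipe(url: str, dry_run: bool = False):
    ytdlp_cmd = ["yt-dlp", "-o", "-", url]  # "-o -" writes the stream to stdout
    mpv_cmd = ["mpv", "-"]                  # "-" makes mpv read from stdin
    if dry_run:
        return ytdlp_cmd, mpv_cmd           # for inspection/testing only
    dl = subprocess.Popen(ytdlp_cmd, stdout=subprocess.PIPE)
    try:
        subprocess.run(mpv_cmd, stdin=dl.stdout)
    finally:
        dl.terminate()
```

The shell equivalent is simply `yt-dlp -o - URL | mpv -`.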
[+] rickreynoldssf|2 years ago|reply
YouTube changes small things in this process all the time. I used to work on an internal editing tool for YouTube videos that needed the MP4 files. Every month or so the editor would break because of a YouTube change and I needed to dive into the debugger to see what they changed and adjust to that.
[+] AltruisticGapHN|2 years ago|reply
On a sidenote am I imagining things or do videos actually look a tiny bit better in YouTube?

This really has puzzled me. I downloaded a few favourites and watch them on VLC or Infuse on my AppleTV. In the YouTube app I can use the "nerd stats" to confirm I am viewing the exact same video/audio streams...

... it could be my imagination, but it seems like YouTube does a really subtle kind of filter that makes the "blocky" compression artifacts smoother. It doesn't enhance edges or anything - my guess is it looks for areas WITHOUT edges where there are subtle shifts of colour, and it makes the blocky artifacts less prominent.

... it's really subtle and I still can't tell if it's just my imagination, like my OCD thinking that my downloaded video doesn't look as good. And yet, I noticed on YouTube the video feels more vibrant and solid. When I watch my downloaded vid there are these really subtle but noticeable artifacts, often in the background, in the shadows, and these constant tiny little jitters even on a 1440p video - they make the final picture look not as good.

Am I making this up?

Audio wise there is definitely a change as well. YouTube audio is always more or less level for me, while a downloaded video always needs to crank up the volume which is annoying.

I wish players like VLC or Infuse did whatever YouTube does to make videos just more pleasant to look at. I don't think YouTube changes the colours or does any kind of vibrancy filter, though I may be wrong, but it does things to "level" audio so that you have a more consistent experience going from one video/channel to another.

[+] djkoolaide|2 years ago|reply
I can answer about the audio.

YouTube (and most other streaming sites like Spotify etc) use something called ReplayGain. It's essentially a tag that specifies the calculated average loudness of the video/song/whatever (this number is calculated at upload time).

Upon playback, the official YT client knows to use that tag and adjust its volume level accordingly, but I'd imagine either the tag isn't getting downloaded, or perhaps MKV doesn't support ReplayGain tags natively.
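The adjustment described here is just gain math; a toy sketch (the -14 LUFS target is what YouTube is commonly reported to normalize to; treat both the target value and the exact behavior as assumptions, not documented facts):

```python
# Toy sketch of playback loudness normalization: the client reads a
# loudness value computed at upload time and applies a fixed gain to
# reach a reference level. (-14 LUFS is the commonly reported target.)
TARGET_LUFS = -14.0

def playback_gain_db(measured_lufs: float) -> float:
    """Gain the player applies, in dB (negative = turn the volume down)."""
    return TARGET_LUFS - measured_lufs

def db_to_linear(gain_db: float) -> float:
    """Convert a dB gain into a linear sample multiplier."""
    return 10 ** (gain_db / 20)
```

Reportedly YouTube only attenuates loud content rather than boosting quiet content, which would match the parent's experience of downloaded files needing the volume cranked up.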

[+] mgdlbp|2 years ago|reply
What filter the player uses for scaling (and chroma scaling) affects sharpness. Playing a native resolution stream fullscreen should eliminate differences here for monochrome edges. Depending on OS you might be able to toggle rapidly between two fullscreen apps to test for differences (cf ISO 29170-2, which recommends 5 Hz).

Colour shifts (as in, input != output, not gradients) can come from bad handling of video colour space or monitor profile. Also, shenanigans here can have screenshots looking different from the actual application.

'jitters' might be dropped frames, but then you mention resolution. Since you also mention edges, if you're noticing pixellation in the edges of coloured objects, that would be nearest-neighbour chroma upscaling, which I do remember some player using at some point.
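The chroma-scaling point can be made concrete with a toy 1-D upsampler (illustrative only; real players scale 2-D chroma planes): nearest-neighbour just repeats each half-resolution chroma sample, which is what shows up as blocky colour edges, while bilinear interpolates between samples:

```python
# Toy sketch of chroma upsampling in 4:2:0-style video: chroma is stored
# at half resolution and the player scales it back up for display.
def nearest_upsample(row):
    """Repeat each chroma sample; produces blocky colour edges."""
    return [v for v in row for _ in (0, 1)]

def bilinear_upsample(row):
    """Interpolate between neighbouring samples; smoother colour edges."""
    out = []
    for i, v in enumerate(row):
        out.append(v)
        nxt = row[i + 1] if i + 1 < len(row) else v
        out.append((v + nxt) / 2)
    return out
```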

[+] planede|2 years ago|reply
A single youtube video has a couple of available video, audio and combined streams. It might be your browser picks a different stream to what youtube-dl (or its fork) takes by default.

I'm not sure what the defaults currently are for youtube-dl and its forks, but for a long time it defaulted to the best combined stream. However the best distinct audio and video streams are higher quality.
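A toy illustration of that tradeoff (the itag numbers below are historically common YouTube format IDs; combined/muxed streams have long topped out around 720p while separate DASH streams go higher):

```python
# Toy illustration of why "best video + best audio" can beat the best
# combined stream: combined (pre-muxed) formats historically top out
# at 720p, while separate video/audio streams reach higher quality.
streams = [
    {"itag": 18,  "kind": "combined", "height": 360},
    {"itag": 22,  "kind": "combined", "height": 720},
    {"itag": 137, "kind": "video",    "height": 1080},
    {"itag": 140, "kind": "audio",    "height": 0},
]

best_combined = max((s for s in streams if s["kind"] == "combined"),
                    key=lambda s: s["height"])
best_video = max((s for s in streams if s["kind"] == "video"),
                 key=lambda s: s["height"])
```

yt-dlp's current default selector is roughly `bestvideo*+bestaudio/best`, i.e. it already prefers the separate streams whenever it can merge them.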

[+] Adverblessly|2 years ago|reply
Players like VLC do much better and let you make your own arbitrary adjustments via GLSL :)

In particular, I like the anime4k shader pack, which is ML-based but runs in real time in mpv (and I think VLC as well). While it is tuned for anime (as is obvious from the name), it has decent denoise and deblur which often make YT content more watchable and a restore step that does a really good job with compression artifacts but is a bit too tuned for anime so may not always work, or even make things worse. See https://github.com/bloc97/Anime4K/releases

[+] Y-bar|2 years ago|reply
Maybe the video pipeline in the browser is different compared to the one VLC uses? Can you do a simple HTML document with a <video> element referencing your local file to compare?
[+] AltruisticGapHN|2 years ago|reply
If anyone's still reading this or anyone cares here is what I found out:

- the issue I was experiencing is with a lower-quality VP9 encode, from a video that was recently uploaded (explained below)

- though I didn't trust smaller-filesize VP9 streams initially, they in fact look noticeably better - the picture is smoother and cleaner, and the compression artifacts are less visible. Where AVC can have jittery/glittery distracting dots moving in the background in areas with subtle gradients (eg. a plain wall), VP9 has none of these; those areas look smoother and cleaner, giving an overall nicer-looking picture without compromising the detail as far as I can tell

- the Opus audio stream appears to have less of the ReplayGain issue, I'm not sure - but since I downloaded Opus instead of the 140 m4a stream I notice I don't need to adjust the volume compared to viewing the same video in YouTube - and since the codec is newer anyway and the filesize is about the same or a tad smaller - also it is 48 kHz not 44.1 kHz - I am going to download Opus from here on

- a very confusing thing: for a recent upload, you can have an initial VP9 stream of say 500 MB which is in fact no better than the AVC one and has the grainy artifacts - and the VP9 stream gets replaced weeks later by one significantly smaller (like 400 MB vs 500 MB!!) that looks way better. Which suggests there was a first pass with low-quality encoding, replaced by a higher-quality encoding later - therefore my assumption that a larger filesize is better was wrong

[+] grotorea|2 years ago|reply
Doesn't YouTube ultimately use the browser's video player now that Flash is no more?

In addition to what the other poster said about checking if you're downloading the exact same codec etc as you're watching, you could try playing the downloaded video with the browser and see what it looks like.

[+] rubatuga|2 years ago|reply
macOS has an AI video upscaling algorithm that was introduced in version 12
[+] Knee_Pain|2 years ago|reply
I don't understand your reply.

Why don't you just take a screenshot at the exact same timestamp? It's that easy and would have taken you less time than writing this up.

[+] glonq|2 years ago|reply
> On a sidenote am I imagining things or do videos actually look a tiny bit better in YouTube?

I'm not [ADVERTISEMENT] sure, because my YouTube viewing [ADVERTISEMENT] experience nowadays is so [ADVERTISEMENT] [ADVERTISEMENT] frequently interrupted with ads that it [ADVERTISEMENT] breaks my focus. #pleaselikeandsubscribeandclickonthenotificationbell

[+] no_time|2 years ago|reply
I'm constantly surprised when YT deploys another half measure against downloaders when GOOG also owns Widevine. I wonder what their reasons are for not using it.
[+] Klaus23|2 years ago|reply
The software version of widevine would be so thoroughly broken in a very short time that it would be bypassed by any one-click downloader addon. Nothing would change for users and YouTube would have the overhead for widevine.

Using the hardware based version could cause a lot of problems with unsupported devices.

[+] londons_explore|2 years ago|reply
I don't think youtube really cares if you pirate their content or use a 3rd party client.

What they care about is you wasting their bandwidth. For an ad-supported video streaming site, bandwidth is normally more expensive than revenue - Google only manages to make it just about work because they have probably the world's cheapest bandwidth, due to being able to bully ISPs into peering with them for free (they don't let you peer with Google for just Google Search but not YouTube).

All these throttling measures are simply trying to reserve most of the bandwidth for real users, not people scraping all the content.

[+] dns_snek|2 years ago|reply
> I wonder what is their reasons for not using it

- Introduces a decryption step, which is slow

- Forces software video decoding, which is slow

- Web browsers only support the weakest form of Widevine which is ineffective

It would effectively push a significant portion of their user base off the platform while not being very effective in its goals.

[+] kelvinjps|2 years ago|reply
I always wondered how YouTube distributes videos. It's the smoothest video platform; even when I had crappy internet it worked fine, and not every platform works well in South America. The closest to YouTube is Netflix, but it lags behind a lot.
[+] 0x7d0|2 years ago|reply
Have you ever tried to download videos from YouTube? I mean manually without relying on software like youtube-dl, yt-dlp or one of “these” websites. It’s much more complicated than you might think.
[+] jgtrosh|2 years ago|reply
Technically, all interesting.

Ethically, if you don't only think “fuck Google”, I feel like it's reasonable to stop after the first optimization (“pass the real browser test to get regular browser speeds”). There you're not “wasting” any more of YouTube's resources than a browser user with ad-block.

Getting full Gb/s without paying anything feels to me like you're pushing all the ad-blocked users' luck.

But then again, fuck Google I guess?

[+] hknmtt|2 years ago|reply
a good read on HN after a very long time, for me.
[+] adhvaryu|2 years ago|reply
I have to agree, it's an interesting topic with a bit of "hacking" masala and just very well written. Can't remember the last time I read a full article here.
[+] swyx|2 years ago|reply
> The most popular one is yt-dlp (a fork of youtube-dl) programmed in Python, but it includes its own custom JavaScript interpreter to transform the n parameter.

ah i remember this one: https://news.ycombinator.com/item?id=32793061

i confess i still don't really understand why they had to make this but i'd love to hear the story behind it

[+] albert_e|2 years ago|reply
Some videos offer multiple audio tracks for different languages? How have I never come across such videos before, or somehow missed them?
[+] causi|2 years ago|reply
I highly recommend that anybody who cares about content on YouTube download the videos they like and maintain local copies. Material is being deleted or hidden faster than ever, and YouTube is only going to get more user-hostile over time.
[+] wodenokoto|2 years ago|reply
Any guides to learn how to do a similar analysis on other websites?
[+] abalashov|2 years ago|reply
Not trying to be a sketchy contrarian, but why would you do this with JavaScript? It just doesn't seem very fit for purpose...
[+] ck2|2 years ago|reply
I mean you don't actually think this will continue working a week after it's widely shared, right?
[+] antiloper|2 years ago|reply
yt-dlp is open source and I'm sure the Google engineers have been aware of it ever since it or its ancestors were released.
[+] ngc6677|2 years ago|reply
Super cool breakdown, gg!
[+] 1vuio0pswjnm7|2 years ago|reply
"Have you ever tried to download videos from YouTube? I mean manually without relying on software like youtube-dl, yt-dlp or one of "these" websites. It's much more complicated than you might think."

This reminds me of some sort of fizzbuzz test. This is not complicated at all. There is no need to use the Range header or run Javascript.

The short script below does not download anything because there is no need. It does not use Range headers, it does not run Javascript and it makes only one TCP connection. With the JSON it fetches, one can simply extract the videoplayback URLs and put them in a locally-hosted HTML page with no Javascript.

    #!/bin/sh
    # usage: echo videoId | $0 <-- this will indicate len to use    
    # usage: echo videoId | $0 len | openssl s_client -connect www.youtube.com:443 -ign_eof
    # usage: $0 len < videoId-list | openssl s_client -connect www.youtube.com:443 -ign_eof
    
    (
    while read x;do
    test ${#x} -eq 11||continue
    if test $# -ne 1;then len=${#x};x=$(grep -m1 ^\{ $0|sed 's/\$x//'|wc -c);exec echo usage: ${0##*/} $((x+len));fi
    
    cr=$(printf '\r');
    sed "/^[a-zA-Z].*: /s/$/$cr/;s/^$/$cr/" << eof 
    POST /youtubei/v1/player?key=AIzaSyA8eiZmM1FaDVjRy-df2KTyQ_vz_yYM39w HTTP/1.1
    Host: www.youtube.com
    Content-Type: application/json
    Content-Length: $1
    Connection: keep-alive
    
    {"context": {"client": {"clientName": "IOS", "clientVersion": "17.33.2" }}, "videoId": "$x", "params": "CgIQBg==", "playbackContext": {"contentPlaybackContext": {"html5Preference": "HTML5_PREF_WANTS"}}, "contentCheckOk": true, "racyCheckOk": true}
    eof
    done
    printf '\r\n'
    printf 'GET /robots.txt HTTP/1.0\r\nHost: www.youtube.com\r\nConnection: close\r\n\r\n';
    )
    
For processing the JSON I wrote custom utilities in C that (a) extract videoIds and other useful strings, (b) generate HTTP similar to above, and (c) filter the returned JSON into CSV, SQL or HTML. For me, these run faster than Python and jq and are easier to edit. Using these utilities I can also do full searches that return hundreds to thousands of results and I can easily exclude all "suggested" or "recommended" videos.

CSV output

1666520150,23 Oct 2022 10:15:50 UTC,22,aqz-KE-bpKQ,"Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film",00:10:35,635,UCSMOQeBJ2RAnuFungnQOxLg,19211597,"Blender"

SQL output

INSERT INTO t1(ts,utc,itag,vid,title,dur,len,cid,views,author) VALUES(1666520150,'23 Oct 2022 10:15:50 UTC',22,'aqz-KE-bpKQ','Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film','00:10:35',635,'UCSMOQeBJ2RAnuFungnQOxLg',19211597,'Blender') ON CONFLICT(vid) DO UPDATE SET views=excluded.views;

HTML output

Looks just like CSV except vid is a hyperlink
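For readers without those C utilities, the videoplayback-URL extraction step can be sketched in Python (the streamingData layout below is the commonly observed shape of the player response, not a documented contract; throttled formats may carry a signatureCipher field instead of a plain url and need extra work):

```python
# Sketch: pull direct videoplayback URLs out of a /youtubei/v1/player
# JSON body. The streamingData layout is the commonly observed shape,
# not a stable API; formats carrying "signatureCipher" instead of a
# plain "url" are skipped here and would need extra processing.
import json

def extract_videoplayback_urls(player_json: str):
    data = json.loads(player_json)
    sd = data.get("streamingData", {})
    urls = []
    for fmt in sd.get("formats", []) + sd.get("adaptiveFormats", []):
        if "url" in fmt:
            urls.append((fmt.get("itag"), fmt["url"]))
    return urls
```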

[+] hruzgar|2 years ago|reply
So make a nice, well-documented blog post for Google engineers to understand and fix this issue?? Whyy