top | item 8648436

rg3 | 11 years ago

It's very nice to see a project I started reach the front page of HN.

I remember starting the project around 2006. Back then, I had a dial-up connection and it wasn't easy for me to watch a video I liked a second time. It took ages. There were Greasemonkey scripts for Firefox that weren't working when I tried them, so I decided to start a new project in Python, using the standard urllib2. I made it command line because I thought it was a better approach for batch downloads and I had no experience writing GUI applications (and I still don't have much).

The first version was a pretty simple script that read the webpages and extracted the video URL from them. No objects or functions, just the straight work. I adapted the code for a few other websites and started adding some more features, giving birth to metacafe-dl and other projects.
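That first-version approach — fetch the page, pull the media URL out with a regular expression, download it — can be sketched in a few lines of Python (modern `urllib.request` standing in for the Python 2 `urllib2` the original used; the page structure and regex here are hypothetical, since every site embedded its video URL differently):

```python
import re
import urllib.request  # the 2006 original used Python 2's urllib2


def extract_video_url(page_html):
    # Hypothetical pattern: real sites embedded the media URL in player
    # markup or script variables, and the exact regex varied per site.
    match = re.search(r'"video_url"\s*:\s*"([^"]+)"', page_html)
    if not match:
        raise ValueError("no video URL found in page")
    return match.group(1)


def download(page_url, out_path):
    # Fetch the watch page, locate the media URL, then fetch the media.
    with urllib.request.urlopen(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    urllib.request.urlretrieve(extract_video_url(html), out_path)
```

No objects, no site abstraction — just "the straight work", which is why adapting it to each new site initially meant copying and editing the script.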

The rise in popularity came in 2008, when Joe Barr (RIP) wrote an article about it for Linux.com.[1] It suddenly became much more popular and people started to request more features and support for many more sites.

So in 2008 the program was rewritten from scratch with support for multiple video sites in mind, using a simple design (with some defects that I regret, but hey, it works anyway!) that more or less survives until now. Naturally, I didn't change the name of the program. Changing it would have lost the bit of popularity it had. I should have named it something else from the start, but I didn't expect it to be so popular. One of these days we're going to be sued for trademark infringement.

In 2011 I stepped down as the maintainer due to lack of time, and the project has since then been maintained by the amazing youtube-dl team, which I always take the opportunity to thank for their great work.[2] The way I did this was simply by giving push access to my repository on GitHub. It's the best thing I did for the project, bar none. Philipp Hagemeister[3] has been the head of the maintainers since then, but the second contributor, for example, was Filippo Valsorda[4], of Heartbleed tester[5] fame and now working for Cloudflare.

[1] http://archive09.linux.com/articles/114161

[2] http://rg3.name/201408141628.html

[3] https://github.com/phihag

[4] https://github.com/filosottile

[5] https://filippo.io/Heartbleed/

dredmorbius|11 years ago

Another thanks here. I use youtube-dl so much that I occasionally substitute it for wget when trying to fetch online content (and occasionally discover in the process that it will in fact grab what it was that I was trying to get in the first place -- mostly audio files).

I vastly prefer offline media players to browser-based tools, for a number of reasons: better controls and playback, richer features, uniform features (I don't have to learn each individual site's idiosyncrasies), the ability to queue up a set of media from numerous sources and play them back without clobbering one another, and more.

Hugely useful tool, and I've been impressed as hell as well by its update frequency.

And lift a mug to old Warthog. I miss Joe as well.

shutupalready|11 years ago

> I vastly prefer offline media players to browser-based tools

Why don't browsers provide some way to play local video files, for example by typing "file:///c:/my_video.flv" into the address bar? After all, the browser certainly includes the ability to play the video being downloaded off the web.

If you try "file:///c:/my_video.flv" with Firefox, it opens a dialog box offering to pass the video file to whatever external media players you have installed.

In what seems inconsistent to me, "file:///c:/my_notes.txt" and "file:///c:/my_pic.jpg" will be rendered correctly by Firefox -- it won't offer to open an external text editor or photo viewer. Why is video different?

FiloSottile|11 years ago

Filippo here. I'm sad that I haven't had the time to contribute much to ytdl recently, but it was my first time playing a role in a big and popular project, and I'm terribly grateful for that trust. (Thanks Ricardo, thanks Philipp!)

Also, what always impressed me is the incredible amount of random contributions from the community. Ever since we introduced a super-simple plugin system,[0] support for the most disparate video sites has poured in as PRs (more than 800 of them!). Also, given how ytdl is structured, the simplest plugin gets you 90% of the tool's power for that video site. Big results with minimum effort.
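The core of that plugin idea — each site gets a small extractor class, the right class is matched against the URL, and the shared machinery handles everything else — can be illustrated with a registry sketch like this (not youtube-dl's actual code; the site, class name, and returned fields are made up for illustration):

```python
import re

EXTRACTORS = []


def register(cls):
    # Decorator: supporting a new site is just defining one small class.
    EXTRACTORS.append(cls)
    return cls


@register
class ExampleTubeIE:
    # Hypothetical site; a real extractor would also parse titles,
    # formats, thumbnails, etc. from the fetched page.
    _VALID_URL = r'https?://example-tube\.test/watch/(?P<id>\w+)'

    def extract(self, url):
        video_id = re.match(self._VALID_URL, url).group('id')
        # A real extractor would fetch the page here and locate the
        # media URL; we fabricate one from the video id.
        return {'id': video_id,
                'url': 'https://example-tube.test/media/%s.mp4' % video_id}


def find_extractor(url):
    # The core tries each registered extractor's URL pattern in turn.
    for cls in EXTRACTORS:
        if re.match(cls._VALID_URL, url):
            return cls()
    raise ValueError('no extractor for ' + url)
```

Because the downloading, retrying, and output logic live in the shared core, a plugin only has to answer "does this URL belong to me?" and "where is the media?" — which is what makes drive-by PRs for new sites so cheap.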

Finally, to answer the question about the updates in some sibling comments: there is no active effort against us most of the time (VEVO videos being the notable exception), but supporting so many sites mainly by scraping means that breaking changes happen really, really often.

[0] https://filippo.io/add-support-for-a-new-video-site-to-youtu...

tripzilch|11 years ago

Love your tool. Thanks, like others say I use it at least a few times every week. I just prefer watching longer videos in VLC, easier seeking and no buffering, etc. Also much snappier than Youtube on my old laptop.

hackmiester|11 years ago

This won't solve the buffering problem, but for the record, you can use "open network location" in VLC to watch YouTube videos directly.

Regardless, youtube-dl and VLC complement each other quite well.

rational-future|11 years ago

Great job rg3, thank you very much for your work!

Do you think DRM support in browsers and major tube sites will soon prevent tools like youtube-dl from functioning?

rg3|11 years ago

Well, this is going to be more of a philosophical answer than a technical one: with the popularity of YouTube nowadays, which is available on every platform and lets anyone, anywhere instantly watch a video, the cat-and-mouse DRM game would not succeed. I think DRM is flawed (insert the typical lock-and-key analogy here) but it does work for some situations. For YouTube: probably not. Somewhere, someone talented would crack it and tools like youtube-dl would continue to exist. A recent example is youtube-dl using rtmpdump, when available, to download DRMed videos.

amelius|11 years ago

I use this tool a couple of times every week. I love this tool!

And I must say I'm impressed by its ease of use (basically zero installation effort), and also by the frequent updates.

(I wonder why those frequent updates are necessary, though. Are you under the impression that google is actively working against tools which attempt to download material from youtube?)

phihag_|11 years ago

Hi, I'm the current lead developer. We update extremely frequently because our release model is different from other software; there is usually little fear of regressions (fingers crossed), and lots of tiny features (i.e. small fixes or support for new sites) that are immediately useful for our users. We've had the experience that almost all users prefer it that way, so we try to enable every reporter to get the newest version by simply updating instead of having to check out the git repository.

As @filippo said above, there is little if any pushback from video sites. Most of the time, they update their interface (we've gotten better at anticipating minor changes) and something breaks. The recent string of YouTube breaks (for some videos, mostly music videos - general video is unaffected) is caused by the complexity of their new player system, which forces us to behave more and more like a full-fledged web browser. But I think we usually manage to get out a fix and a new release within a couple of hours, so after a quick youtube-dl -U (caveats do apply[0]) you should be all set again. Sorry!

[0] https://yt-dl.org/update

rg3|11 years ago

The current team should have more information, but I think most updates are due to other sites breaking and new sites being added, rather than to YouTube changes or bug fixes. I don't think YouTube is actively working against tools like youtube-dl, at all.

spindritf|11 years ago

The frequent (almost daily, sometimes more frequent than daily) updates are really impressive.

rg3|11 years ago

I agree. It's a consequence of the software being very volatile, having thousands of users and supporting so many sites. There's something to fix or to add every day.

GhotiFish|11 years ago

This tool ended up being so useful, it made the internet substantially better for me.

To the catalyst and original dev of this tool: thank you!

mzs|11 years ago

Thanks so much as well. I use FreeBSD, and this has been the only way for me to watch YouTube videos consistently.

101914|11 years ago

Remarkably, YouTube makes scripting downloads very easy. The script below needs only sed and some http client and it has worked for years. I have only had to change it once when there was a change at YouTube; the change was very small.

   # this script uses sh, sed, awk, tr and some http client
   # here, some http client = tnftp
   # awk and tr are optional
   
   
   # wrapper for tnftp to accept urls from stdin
   ftp1(){
   while read a;do 
   ftp ${@--4vdo-} "$a" 
   done;}
   
   
   # uniq
   awk1(){ awk '!($0 in a){a[$0];print}' ;}
   
   
   # some url decoding
   f1(){
   sed '
   s,%3D,=,g;
   s,%3A,:,g;
   s,%2F,/,g;
   s,%3F,?,g;
   s/^M      
   //g;
   #  ^ that's Ctrl-V then Ctrl-M in vi (a literal carriage return)
   ' 
   }
   
   # remove redundant itags
   f0(){
   sed -e '
   s/&itag=5//;t1
   s/&itag=1[78]//;t1
   s/&itag=22//;t1
   s/&itag=3[4-8]//;t1
   s/&itag=4[3-6]//;t1
   s/&itag=1[346][0-9]//;t1
   ' -e :1
   }
   
   # separate urls 
   f2(){
   sed '
   s,http,\
   &,g' 
   }
   
   # remove unneeded lines
   f3(){
   sed '
   #/^http%3A%2F.*c.youtube.com/!d;
   /^http%3A%2F.*googlevideo.com/!d;
   /crossdomain.xml/d;
   s/%25/%/g;
   s,sig=,\&signature=,;
   s,\\u0026,\&,g;
   /&author=.*/d;
   ' 
   }
   
   
   
   # separate cgi arguments for debugging
   f4(){
   sed '
   s,%26,\
   ,g;
   s,&,\
   ,g;
   ' 
   }
   
   # remove more unneeded lines
   f5(){
   sed '
   /./!d;
   /quality=/d;
   /type=/d;
   /fallback_host=/d;
   /url=/d;
   /^http:/!s/^/\&/
   /^[^h].*:/d;
   /^http:.*doubleclick.net/d;
   /itag.*,/d;
   '
   }
   
   # print urls 
   f6(){
   sed 's/^http:/\
   &/' | tr -d '\012' \
   |sed '
   s/http:/\
   &/g;
   ' 
   }
   
   f8(){
   sed 's/https:/http:/'
   }
   
   FTPUSERAGENT="like OSX"
   
   case $# in
   0) 
   echo|$0 -h 
    ;;
   [12345])
   case $1 in
   
   -h|--h)
   echo "url=http[s]://www.youtube.com/watch?v=..........."
   echo usage1: echo url\|$0 -F \(get itag-no\'s\)
   echo usage2: echo url\|$0 -g \(get download urls\)
   echo usage3: echo url\|$0 -fitag-no -4o video-file
   echo N.B. no space permitted after -f
   
    ;;
   -F)
   $0 -g \
   |tr '&' '\012' \
   |sed '
   /,/d;
   /itag=[0-9]/!d;
   s/itag=//;
   /^17$/s/$/ 3GP/;
   /^36$/s/$/ 3GP/;
   /^[56]$/s/$/ FLV/;
   /^3[45]$/s/$/ FLV/;
   /^18$/s/$/ MP4/;
   /^22$/s/$/ MP4/;
   /^3[78]$/s/$/ MP4/;
   /^8[2-5]$/s/$/ MP4/;
   s/.*?//;
   '|awk1
    ;;

   -g)
   while read a;do
   n=1
   while [ $n -le 10 ];do
   echo $a|f8|ftp1||
   echo $a|f8|ftp1 &&
   break
   n=$((n+1))
   done \
   |f2|f3|f1|f0|f4|f5|f6|f1|sed '/itag='"$2"'/!d'
   done
    ;;

   -f*)
   while read a;do
   n=1
   while [ $n -le 10 ];do
   echo $a|$0 -g ${1#-f} |ftp1 $2 $3 $4 $5 ||
   echo $a|$0 -g ${1#-f} |ftp1 $2 $3 $4 $5  && 
   break
   n=$((n+1))
   done
   done
    ;;

   esac
   esac

There are separate scripts for extracting www.youtube.com/watch?v=........... urls from web pages to feed to this script.

phihag_|11 years ago

The problem is that this only works for some YouTube videos (for example it will fail for basically all VEVO videos), not to mention maintainability issues.

philtar|11 years ago

Pastebin next time, please.