Tornado: FriendFeed's non-blocking Python web server is now open source

[+] tdavis|16 years ago|reply

This is more of a combination of a web server and a web framework, which is what I find fascinating about it. Twisted has had a web server forever, but despite a significant amount of experience in Twisted Land, I'd never write a full app using it! Add to that the (apparently) standalone low-level modules and we've got some seriously awesome tech that FriendFeed/Facebook have supplied for free.

Thanks a lot, fellas. Asynchronous network programming in Python is somewhat of a bear, so it's really nice to see a tool get released that improves it!

[+] casey|16 years ago|reply

An instance of the chat demo is running here: http://chan.friendfeed.com:8888/

Website: http://www.tornadoweb.org/

Source: http://github.com/facebook/tornado/tree/master

[+] far33d|16 years ago|reply

The chat demo seems to have turned into a tech discussion about the framework. Love it.

[+] n8agrin|16 years ago|reply

Awesome to see more Python webservers coming to light, and fast ones at that. One observation though; anyone else notice that the bar chart is a bit misleading? It compares running Tornado in 4 processes behind nginx to running Django behind Apache and Cherrypy as a single Python process. As a fellow coworker put it, "With 4 extra processes running of course it will be 4 times as fast." I'm not opposed to this kind of comparison when the intent is to show how configurations can increase req/sec, but if the intent is to compare one Pythonic webserver to another, this comparison seems a bit unfair. That said, assuming the "Tornado (1 single-threaded frontend)" measure is simply Tornado running as a single Python process, it still is plenty faster than Cherrypy.

[+] mdasen|16 years ago|reply

Apache/mod_wsgi creates multiple processes (or can). The issue is that Facebook didn't tell us what configuration settings they used. Did they limit the number of processes? Did they not have it start up enough at the start?

It's clear that Facebook wasn't using a single Apache/mod_wsgi process because that wouldn't get close to 2,000 requests per second. I'm sure they gave it the number of processes that were appropriate for the amount of RAM on the box. The issue is that every Apache process (of which I'm sure they had dozens if not hundreds) uses up memory.

It's hard to set that up properly and that's what makes benchmarking so hard. Still, it's not hard to believe that with the weight of Apache in the process, there's going to be at least 5MB of overhead for every request. If you were saving that 5MB of RAM per client times the 2,223 requests per second, that's 11GB worth of freed RAM per second. Let's say Django uses 10MB of RAM for a request: that's an extra 50% increase in capacity right there just by eliminating the 5MB Apache process overhead. And Apache tends to be on the heavier side (5MB is somewhat low as an estimate) and if you can do it asynchronously, you're not tying up RAM as you wait for clients, so you do well.
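The arithmetic in the paragraph above can be checked with a short sketch. The 5MB overhead and 10MB per-request figures are the comment's own guesses rather than measurements, and the variable names are only illustrative.

```python
# Back-of-the-envelope check of the figures quoted above.
# All inputs are the commenter's estimates, not measured values.
APACHE_OVERHEAD_MB = 5      # assumed Apache overhead per request
DJANGO_REQUEST_MB = 10      # assumed Django footprint per request
REQUESTS_PER_SECOND = 2223  # throughput figure quoted in the comment

# Memory attributable to Apache overhead across one second of requests.
overhead_gb = APACHE_OVERHEAD_MB * REQUESTS_PER_SECOND / 1000

# With a fixed RAM budget, capacity scales inversely with memory per request.
with_apache_mb = DJANGO_REQUEST_MB + APACHE_OVERHEAD_MB
capacity_gain = with_apache_mb / DJANGO_REQUEST_MB - 1

print(f"Apache overhead: about {overhead_gb:.1f} GB")
print(f"Capacity gain without it: {capacity_gain:.0%}")
```

Under those assumptions the overhead comes to roughly 11GB and the capacity gain to 50%, matching the comment. The 11GB figure only holds if every request in that second is resident in memory at the same time.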

Now, put Apache/mod_wsgi behind nginx (as a proxy) and you'll also see an increase in requests per second because you won't be tying up all that RAM as a client is downloading and only use the RAM while actually doing work in Python freeing that memory to be used elsewhere. However, I'm guessing that they ran this benchmark locally and so the network tying up Apache instances wasn't a factor.

Clearly, benchmarks are benchmarks, but it isn't as if Facebook and FriendFeed don't have lots of smart engineers and they wouldn't be using their own server if Apache was significantly better. They have a site of massive scale and know how to benchmark things since you need objective data to make decisions based on rather than the religious arguments some get into.

I'd guess that, if one could get Django running under Tornado, it would run similarly and that it's more of Apache being the bottleneck/RAM hog.

[+] finiteloop|16 years ago|reply

CherryPy is multi-threaded, so I am not sure what you are saying is correct. There are two types of servers: multi-threaded and multi-process. Tornado is multi-process. If your server is multi-threaded, it uses all of the cores without additional processes.

CherryPy did max out the CPU on all of the cores in the load test, so I think it was a fair test.

[+] finiteloop|16 years ago|reply

Likewise, 4 extra processes does not mean 4 times as fast if you only have one or two cores on your machine. Check out the details at http://www.tornadoweb.org/documentation#performance for how we ran the test.

[+] n8agrin|16 years ago|reply

The templating system they developed looks pretty sweet http://github.com/facebook/tornado/blob/master/tornado/templ....

Just played around with it a bit and it's basically what I want:

1) Simple and clear syntax (e.g. they use 'end' not endfor, endblock, etc)

2) Assign template variables to anything (including functions)

3) Don't over-restrict the author (e.g. they allow list comprehensions in if tags)

4) Block and extends statements.

Error handling seems to be a little clearer than other template languages though still not great:

http://gist.github.com/184934

Clearly it's not going to be for everyone but after slogging around with Mako for the last year (<%namespace:def /> tags anyone?) this is like a breath of fresh air.

[+] finiteloop|16 years ago|reply

Thanks! Yah, the reason we rolled our own is because all the rest were so restrictive to authors. We didn't want a template system telling us what we should and shouldn't do in templates, so ours was a very thin layer on top of what basically translates directly to Python. It is actually one of my favorite parts of the system.

[+] RoboTeddy|16 years ago|reply

It seems like blocking calls to data sources (the database, memcache, etc) would screw with a lot of the benefits of running the framework asynchronously. If your Tornado processes freeze while making synchronous backend requests, you're gonna need a lot of them, which probably kills a lot of the value.

Now all we need are async data clients that tie into the Tornado event loop, and a clean way to yield control back and forth (perhaps with coroutines - http://www.python.org/dev/peps/pep-0342/). Unfortunately, I don't think there are any async Python mysql clients (PHP has one now - http://www.scribd.com/doc/7588165/mysqlnd-Asynchronous-Queri...)

A web app written like that would run like it's on fire. The event loop would just roll through everything asynchronously. You could even automatically batch together data calls across http requests to make things easier on the data tier.
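The "yield control back and forth" idea from PEP 342 can be sketched with plain generators. This is a toy illustration, not Tornado's API: `fake_async_fetch`, `handler`, and `run` are hypothetical names, and the loop answers every request immediately instead of waiting on real sockets.

```python
from collections import deque


def fake_async_fetch(key):
    """Hypothetical stand-in for an async data client: describes a request."""
    return ("fetch", key)


def handler(name, log):
    # Each `yield` hands control back to the loop until a result is available.
    user = yield fake_async_fetch(f"user:{name}")
    log.append(f"{name}: got {user}")
    feed = yield fake_async_fetch(f"feed:{name}")
    log.append(f"{name}: got {feed}")


def run(coroutines):
    """Round-robin loop: resume each coroutine with the result of its last request."""
    ready = deque((co, None) for co in coroutines)
    while ready:
        co, value = ready.popleft()
        try:
            request = co.send(value)  # the first send(None) starts the generator
        except StopIteration:
            continue  # this handler has finished
        # A real loop would wait for I/O readiness here; we answer right away.
        _kind, key = request
        ready.append((co, f"result for {key}"))


log = []
run([handler("alice", log), handler("bob", log)])
print("\n".join(log))
```

The two handlers interleave: each one runs until it needs data, then steps aside so the other can make progress. A real data client would have to register its socket with the event loop for this to pay off.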

[+] DenisM|16 years ago|reply

The browser-webserver delay resulting from long polling is much larger than the frontend-backend delay. I think there is at least an order of magnitude difference there, hence an order of magnitude reduction in the amount of idle state kept on the server.

[+] extension|16 years ago|reply

If you go with an evented architecture like this, you have to use it for pretty much all I/O or the whole thing falls apart. Lack of drivers and protocols has been a large barrier to adoption.

[+] mapleoin|16 years ago|reply

Who wants an (unofficial for now) Fedora RPM? [0] I just hacked this together in half an hour. Time to sleep now.

[0] http://mapleoin.fedorapeople.org/pkgs/tornado/tornado-0.1-1....

[+] cheriot|16 years ago|reply

Definitely rough around the edges, but looks like a great foundation.

A few notes:

- template.py: "Error-reporting is currently... uh, interesting."
- url mapping looks primitive
- no form helpers
- database.py looks decent, but is mysql only
- Very nice to see the security considerations in signed cookies, auth, csrf protection.

Overall, it looks like they've done the trickier parts of building a web framework and left it to the user/community to add the parts web developers use most frequently (form & url helpers, code organization, and orm).

I love Facebook's enlightened approach to open source. If only one of their open source projects takes off, it will benefit them tremendously.

[+] xal|16 years ago|reply

I implemented the same idea in Ruby event machine: http://gist.github.com/184760

Obviously missing features (auth, nicks, scrolling) but that can all be added in a few mins.

[+] tiredandempty|16 years ago|reply

How about turning this into a full framework? Or is there one already?

[+] usaar333|16 years ago|reply

What exactly does 'non-blocking' mean in the context of a webserver?

[+] cheriot|16 years ago|reply

Holding the TCP connection open does not tie up all the resources of the request handling thread. That way a large number of inactive connections can stay open.

[+] acangiano|16 years ago|reply

Non-blocking sockets. Calls to socket.read() can return no data. It's non-blocking because the program is not stuck waiting for data to be returned.
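A minimal sketch of that behavior using only Python's standard library, with a local socket pair standing in for a client and a server. In Python, a non-blocking `recv()` with nothing to read raises `BlockingIOError` rather than returning empty data, and an event loop asks the OS which sockets are ready instead of waiting on any single one.

```python
import selectors
import socket

# A connected pair of local sockets stands in for a client and a server.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)

# Nothing has been sent yet. A blocking socket would wait here; a
# non-blocking one raises BlockingIOError immediately instead.
try:
    server_side.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# Ask the OS which registered sockets are readable.
selector = selectors.DefaultSelector()
selector.register(server_side, selectors.EVENT_READ)
ready_before = selector.select(timeout=0)  # nothing to read yet

client_side.sendall(b"hello")
ready_after = selector.select(timeout=1)   # the socket is now readable
data = server_side.recv(1024)

selector.close()
server_side.close()
client_side.close()
print(would_block, len(ready_before), len(ready_after), data)
```

Because the program is never stuck inside `recv()`, one process can keep many idle connections registered and only touch the ones that have data.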

[+] known|16 years ago|reply

http://www.kegel.com/c10k.html

[+] unknown|16 years ago|reply

[deleted]

[+] ComputerGuru|16 years ago|reply

I've been looking for a good framework to begin serious web development on, as someone coming from a decade-long career in desktop development (see the discussion at http://news.ycombinator.com/item?id=808191). I decided this release coincides nicely with my newly-started quest and gave it a shot.

I realize Tornado isn't any different from the other Python frameworks with regards to coding style, etc., but the fact that this framework comes complete with a web server means I don't have to worry about that part of the equation, making developing Python-based webapps almost identical to developing a C++ library with one of the many HTML-based UI frontends :)

Having a great time playing around with it... almost done with a basic forum system built on Tornado + Storm (https://storm.canonical.com/).. I think I'm getting the hang of this whole web-development thing! :D

[+] apoirier|16 years ago|reply

For a really different coding style, more like desktop development, take a look at Nagare (http://www.nagare.org), a continuation and components based web framework. And it also comes with an integrated HTTP server (and a FastCGI one) ;)

[+] calaniz|16 years ago|reply

This looks great. I'm likely to look at it a little more in the future. I've been using Second Life's asynchronous coroutine library called eventlet. I've implemented most of my code with a good lightweight framework that fits: restish. On top of that, I use spawning to manage my server processes. I myself have seen these kinds of numbers with my own app tests.

[+] omouse|16 years ago|reply

Does this implement all of the HTTP 1.0 and 1.1 protocol specs? I'm kinda confused by the code :S

[+] finiteloop|16 years ago|reply

It implements a lot of HTTP/1.1, but see http://www.tornadoweb.org/documentation#caveats-and-support. In practice, we run behind an nginx reverse proxy, so we assume there are missing areas. We recommend people run in a similar fashion in production. We did not optimize for protocol completeness given our production setup.

[+] drawkbox|16 years ago|reply

This is great for Python, a proven web framework that was stressed on FriendFeed. Thanks FF team! I have been torn on my next Python server project between web2py, cherrypy and django and I think I just decided.

[+] mdipierro|16 years ago|reply

For the record. They tested web.py not web2py. It is not the same thing and they are completely unrelated.

[+] polvi|16 years ago|reply

This is great. Any idea if they will do the same for the FriendFeed datastore?

[+] liuliu|16 years ago|reply

As I recall, FriendFeed uses MySQL as the "datastore" backend, but with a very different table layout. Check out this article: http://bret.appspot.com/entry/how-friendfeed-uses-mysql

[+] Raphael|16 years ago|reply

It's just MySQL.

[+] dlsspy|16 years ago|reply

I could use this in apps today if it used twisted instead of reinventing it.

[+] finiteloop|16 years ago|reply

How many deployed web apps really use Twisted? There are like 3 web packages in Twisted, most of which are really buggy, and as far as I could tell, barely used (even they acknowledge this, see http://twistedmatrix.com/trac/wiki/WebDevelopmentWithTwisted).

When we were developing this, we found that Twisted introduced as many problems as it solved in terms of incomplete features and bugs. The other protocols seem to have more attention than HTTP from what I could tell.

[+] statictype|16 years ago|reply

What would be the advantage of using this over, say, lighttpd with one of the many existing python frameworks?

Does this web server offer any specific gains? Or is it just a question of personal taste?

[+] auston|16 years ago|reply

At a first glance, this looks SERIOUSLY awesome!

[+] prakash|16 years ago|reply

Bret, quick question. Does the framework support ESI-type tags to abstract out personalized info? Thanks!

[+] finiteloop|16 years ago|reply

I am not sure what esi type tags are. Mind sending me a link?

Suffice it to say, no, we do not support that :)

[+] charlesju|16 years ago|reply

Is this similar to EventMachine in Ruby?

[+] fizx|16 years ago|reply

EM is a general-purpose library for nonblocking IO. Tornado is an HTTP server. Inasmuch as they're both about nonblocking IO, they are similar.

73 comments