Real-time applications and will Django adapt?

121 points | pramodliv1 | 12 years ago | arunrocks.com

69 comments

[+] nostrademons|12 years ago|reply
(My background: I work for Google, I did a real-time web prototype using the client libraries for GChat back in 2009 when real-time search was all the rage, my Noogler mentor at Google was the frontend tech lead for the eventual real-time search product we launched, and before Google I'd worked in financial software, where real-time responsiveness really is required.)

I think that the folks currently building prototypes in Meteor dramatically underestimate the difficulty of scaling up real-time software to production-grade quality.

The problem is that if a single component in your stack blocks, you are no longer real-time. Any time one client writes into the database and another reads it, you have to poll, since the DB won't give you notifications. (Exceptions: Postgres gives you PQnotifies, Oracle gives you the User Messaging Service, with MySQL it's theoretically possible using triggers and user-defined stored procedures that make a network call, and with MongoDB you can break the DB abstraction and tail the oplog. Good luck plumbing any of these up through your language's DB driver and ORM, though.) If you have business logic in a middle-tier server that's request-response only, then that logic becomes a synchronization bottleneck, and you have to constantly update that server and poll it with requests. If your algorithms require complete state snapshots, you're out of luck unless you build a service to manage and update that state consistently while triggering the algorithms whenever it changes. If your algorithms can't run within soft-realtime guarantees (dozens to hundreds of milliseconds, usually), you're still out of luck. You need to figure out sharding of state and message notifications yourself. You need to figure out message recovery protocols - most real-time systems have odd consistency problems when messages get dropped due to overload, network failures, or software errors.
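
One common way to approach the message-recovery problem described above is to number messages per channel, so a client can notice a gap and ask for a replay. What follows is a minimal sketch under stated assumptions, not anything described in the thread: the `GapDetector` class is hypothetical, and it assumes the server numbers messages 1, 2, 3, ... on each channel.

```python
class GapDetector:
    """Tracks the last sequence number seen on one channel (hypothetical helper)."""

    def __init__(self):
        # Assumption: the server numbers messages 1, 2, 3, ... per channel.
        self.last_seq = 0

    def receive(self, seq):
        """Return the sequence numbers that were skipped before `seq`.

        An empty list means the message arrived in order, or was a
        duplicate/stale message that can be ignored.
        """
        if seq <= self.last_seq:
            return []  # duplicate or stale; nothing new to recover
        missing = list(range(self.last_seq + 1, seq))
        self.last_seq = seq
        return missing
```

The caller would then fetch the missing numbers from some replay endpoint (also an assumption) and apply them separately; this sketch only detects the gap.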

Google's real-time search ended up polling every 15 seconds with simple AJAX calls, because when the lag for a post to go through the indexing & serving pipeline is a minute or two (itself a major accomplishment), an additional 15 seconds isn't going to be noticeable to the user.

People on HN love to hate on Twitter engineering, but one thing they've done really well is scale a system that actually is soft real-time and has a lot of potential producers and consumers. This is far from the trivial exercise that someone who picked up Meteor in a weekend might think it is.

[+] recuter|12 years ago|reply
I think the twitter-hate was usually because Twitter started with a naive Rails approach on Joyent and would fail whale constantly in the early years.

Just recently they blogged how they finally completed a rewrite that improved per-node performance to 100,000 messages per second instead of a few hundred - a better than expected result. It's just a bit OCD-inducing how they had to solve problems of their own creation for a long time instead of "doing it right from the start".

I offer no opinion whether or not they were being smart and pragmatic or incompetent hipsters in the beginning.

Edit - Found it, here's the post: https://blog.twitter.com/2013/new-tweets-per-second-record-a...

[+] siliconc0w|12 years ago|reply
Django's event system lets you avoid polling the DB.

Here is a plugin that offers some real-time capability (through a separate process of course): http://telegraphy.readthedocs.org/en/latest/django-telegraph...

Seems to be the best thing on offer that is off the shelf, but you may be better off rolling your own node, websocket, redis pub/sub, django integration yourself (which actually isn't that hard to do and may give you better flexibility).

[+] sylvinus|12 years ago|reply
From what I understood, Meteor does tail the MongoDB oplog.
[+] joshowens|12 years ago|reply
Scaling anything is hard to achieve, especially when you are also trying to build a business around an idea.

Scaling is usually a problem you solve after you become more popular. My original article about Meteor and Rails wasn't about handling twitter-esque type traffic.

[+] jafaku|12 years ago|reply
How do you know someone works for Google? They will tell you.
[+] programminggeek|12 years ago|reply
There seems to be a meme going around that things like Rails or Django need to somehow change and react to single page javascript web apps.

Maybe it's just me, but trying to modify your favorite web app framework to accommodate something it was never designed to do is foolish and will end up ruining what was originally great about tools like Django in the first place.

Just because a hammer is a popular tool that you really like doesn't mean it needs to change into a ladder when you decide you need to climb onto a roof.

[+] thatthatis|12 years ago|reply
Rails/django are built to build websites. Websites are changing towards being client-side JavaScript single-page apps. Thus either django/rails changes or gets removed.

I'm currently architecting a new app, and my django layer is still crucial for: API access to the data, Auth & Auth, background processing.

What we are trying to do as website builders has changed, and thus we are at a turning point. It isn't obvious yet what the go-to stack of the future is going to look like - is it django + tastypie + angular, or rails + ember, or meteor, or something else?

Django was great for the old way of doing things (static or Ajax-enhanced web). But it's not clear what its role should be in the future.

To use your analogy: this is people trying to figure out if they still need hammers now that we're starting to use screws as fasteners.

[+] technel|12 years ago|reply
I don't see the value proposition of making (most) web apps/sites real-time. Sure, it makes sense for a chat app or a stock ticker, but blogging? A news site? E-commerce?

Maybe it's important that eBay is "real time" in the last 5 minutes of an auction, but the rest of the time, the vast majority of the content is relatively static. A seller might update the description of a listing a couple times over a two week auction, for example. And while it sounds great to immediately update my search results when a new listing goes live, in reality, I already have 40 pages of results to look through, and that listing that just went live 5 seconds ago probably isn't much more relevant than any of the others I'm sifting through.

I'm not opposed to client-heavy apps where it makes sense. When done well, it can create a really responsive user experience. Gmail is great at this; I have no desire for it to be "real time" -- not any more than it already is.

Do we really believe that one day cnn.com will be "real-time", with article updates and errata popping up inline as we read?

[+] gojomo|12 years ago|reply
It's not that everything must be real-time. But, the stuff that doesn't need it has already been well-done for over a decade. The frontier of new possibilities (including as incremental enhancement to the old categories) tends to involve what's enabled by real-time.

For example, sprinkling in a little real-time surprise – like a notification that others have already responded to your recent work – can accelerate valuable interactions.

For example, in 'blogging' and 'news', both the original authors and active commenters appreciate no-reload indications of fresh comments, mentions, and inlinks. You can do a site without that – but you'll be missing out on features that users increasingly expect, and work to create new interesting content and engagement.

In 'e-commerce', a client-pulled site works and is well-understood, but adding live sales help, or indicators of limited deals being exhausted, can help close sales... so why not try it?

Even where the major cores of these markets work fine without real-time, the frontier of exploration and optimization uses greater game-like liveliness.

[+] alaskamiller|12 years ago|reply
When CNN.com replaces what gets piped into a 100 inch screen in your living room, yes.
[+] al2o3cr|12 years ago|reply
Maybe it's just me, but I find the simultaneous popularity of "only check your email 4 times a day" and "OMG ALL WEB APPZ MUST BE REALTIME" slightly peculiar.
[+] marcosdumay|12 years ago|reply
To tell you the truth, I don't normally use real time web apps at all. But I have an urge to turn what I write into real time apps, and no good explanation why; it just feels like they become much easier to use.

Maybe I (and everybody else) only have the wrong impression. It happens, and I don't have enough data to conclude anything.

[+] hayksaakian|12 years ago|reply
idealistic view of the world vs the reality of the world
[+] RussianCow|12 years ago|reply
Python in general doesn't really have a good solution for this, so it's not something specific to Django. I run a Python web app that has certain real-time needs, and I had to forgo a popular web framework like Django so that I could use Twisted. The problem with solutions like this is that since the language doesn't have built-in support for asynchronous IO, everything has to be compatible with the library of your choice (whether that's Twisted, Gevent, or other), and at that point, you'd be better off just using a different language/runtime like Node.js or Erlang.

I think the current solution is to have Django serve the main app and have a separate "API server" that runs Node or whatever, but as the article points out, you're not really even using Django at that point because all it's doing is serving up a single HTML page--the rest is handled by the browser and the API server.

[+] nostrademons|12 years ago|reply
Python 3.4 may help a lot with the language mechanisms (with asyncio, pluggable event loops, and composable generators everywhere), but there's still the issue of getting library support to use all of that.
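
To give a sense of what asyncio makes possible, here is a small sketch of an in-process publish/subscribe hub. It is written with the `async`/`await` syntax that arrived after Python 3.4 (3.4 itself used `yield from` coroutines), and the `Broker` class and `demo` function are hypothetical names invented for illustration.

```python
import asyncio


class Broker:
    """In-process publish/subscribe hub (hypothetical, for illustration only)."""

    def __init__(self):
        self._subscribers = set()

    def subscribe(self):
        # Each subscriber gets its own queue to wait on.
        queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def publish(self, message):
        # put_nowait does not block; the queues here are unbounded.
        for queue in self._subscribers:
            queue.put_nowait(message)


async def demo():
    broker = Broker()
    inbox_a = broker.subscribe()
    inbox_b = broker.subscribe()
    broker.publish("new comment")
    # Both subscribers wait on their queues without blocking the event loop.
    return await asyncio.gather(inbox_a.get(), inbox_b.get())
```

Running `asyncio.run(demo())` delivers the one published message to both subscribers. The library-support issue raised above still applies: anything called from inside these coroutines has to be non-blocking too.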

Node.js isn't actually better - it uses the callback model of async programming, which should be familiar to any C++ programmer who's been writing servers since the 80s, both because it's the current best solution for writing scalable event-driven servers and because it sucks.

For ease of programming a CSP-based language like Go or Erlang is really the way to go, but then you're back to the "lack of library support" problem that you'd get with Python 3.4, except worse because Python at least has libraries for the synchronous part of the computation.

[+] adamauckland|12 years ago|reply
That's currently what I do. Also Django is useful for sketching your requirements out, then once you know what you're building you can move the real-time data retrieval into NodeJS and let Django handle the boring stuff.

Not EVERYTHING in an app will need to be real-time. Especially boring maintenance functionality such as password reset etc.

[+] Kiro|12 years ago|reply
I think very few sites actually need to be SPA at all. Just because an e-commerce site has a real-time component doesn't mean it must be built in Meteor.

E-commerce sites are in fact a prime example of something that I think should be built using traditional technologies. Do you want price updates? Just poll them with AJAX and let the rest of the site remain static. It's far from a multiplayer game we're talking about.

[+] secstate|12 years ago|reply
I don't think I really understand the limitations we're talking about. No, you wouldn't ever want to write an app that had real-time elements in pure Django, but isn't that what Celery is for? I bet with a solid messaging queue and good architecture you could write a pretty convincing real-time app using Django as not much more than a REST API to Celery tasks and the database (and really, that abstraction is what a framework is for anyway).

Besides, this sky-is-falling nonsense around frameworks is getting old. A framework either lives or dies. Django has a very healthy community around it and they are doing a great job right now of keeping the framework stable so folks who "just need to get work done" can get work done. There haven't been a lot of revolutions, and that's fine for me. Believe it or not, there's still a market for content-heavy, traditional MVC websites. And when you need to add real-time elements, Django, Celery and Django REST Framework are up to the task a vast majority of the time.

[+] sheng|12 years ago|reply
Another real time application issue that rarely gets any attention is WebRTC. I wish people would start tackling these issues for python/django, too. As of writing this I don't know about any library that would allow me to write a server application in python that would serve as a peer in a WebRTC session. The benefit would be unreliable real time data channels to the server. This can be of great use for games. Of course there are many different use cases.
[+] falcolas|12 years ago|reply
Aside from an inability to run websockets on Django, I've been running "real time" websites for quite some time. AJAX calls are dirt simple to handle with your typical Django setup.

Scaling and blocking are handled pretty easily by running Django on FCGI using Flup and a Nginx frontend. No blocking problems since they're running in processes and threads, redis for caching and pub/sub, and a database for the backend. Works a charm.

Now then, this isn't a high volume site, getting only in the medium hundreds of requests per minute, but it's been working without problems on a small AWS instance. DB backups take more CPU than Django ever has.

Websockets, on the other hand, took me over to Go. Certainly not giving up Django for the rest of the site, however, until it really can't handle the load anymore.

[+] rbanffy|12 years ago|reply
Wouldn't a websocket middleware solve the issue? Client starts a socket and passes the id through HTTP to the Django app. When something happens in Django, the event is piped through the previously created socket and the problem is neatly solved. Could even have some sophisticated publish/subscribe mechanics in here.
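
The id-passing idea above can be sketched with an in-memory registry. Everything here is hypothetical and stands in for the middleware: `SocketRegistry`, its method names, and the `send` callables are illustration only, not an existing Django or websocket API.

```python
import uuid


class SocketRegistry:
    """Maps socket ids to send callables (hypothetical stand-in for the middleware)."""

    def __init__(self):
        self._sockets = {}

    def register(self, send):
        """Called when a client opens a socket.

        Returns the id the client would pass along with its HTTP
        requests to the Django app.
        """
        socket_id = uuid.uuid4().hex
        self._sockets[socket_id] = send
        return socket_id

    def unregister(self, socket_id):
        """Called when the socket closes."""
        self._sockets.pop(socket_id, None)

    def push(self, socket_id, event):
        """Called from the Django side when something happens.

        Returns False if the socket has gone away.
        """
        send = self._sockets.get(socket_id)
        if send is None:
            return False
        send(event)
        return True
```

In a real deployment the registry would live in the websocket process, so the Django side would reach `push` over some inter-process channel (redis pub/sub, for instance) rather than a direct call.
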
[+] marcosdumay|12 years ago|reply
I'd completely agree.

Django is missing websockets, and little else. All that ode about not repeating code at the client and the server isn't that relevant because the view (client) operates on a completely different environment from the model (server), and does a completely different kind of data manipulation. Very little code repeats, and the little that does is trivial.

Ok, a better way to represent the client code is always welcome, but Django has already a lot of power and flexibility here, and it couples well with Javascript capabilities.

[+] rartichoke|12 years ago|reply
I don't know about Django but rails has the idea of "live controllers".

Sure it uses polling but didn't you watch DHH's railsconf presentation? They have 5-6 workers and a single redis server which sustains 100k+ reqs per minute.

It also only took DHH 4 hours to convert the entire basecamp project to be live (ie. live updating comments as it comes in).

Sure it's not really live since the polling is only happening every few seconds but who cares? Even for most chat systems it's completely reasonable to do polling, most certainly if it's 1:1 chat.

Also look at Disqus. They are mostly all django, they even use postgres with a schema. Their "real time comment system pusher" was written in Go in a week with almost no prior knowledge of Go. I see nothing wrong with that and IMO it's exactly what we should be doing.

Use Django/Rails for the bulk of your app, CRUD interfaces, etc. and then create optimized services with Go or some other language for real-time aspects.

[*] Everything I mentioned is documented online through talks, engineering blogs, etc..

[+] bayesianhorse|12 years ago|reply
I have written "realtime" web applications in Django, using Tornado for websockets (or their emulations). While a pure realtime non-blocking solution might be able to squeeze out a lot more performance, it's certainly possible.

Realtime web applications require a choreography of communication between server and client, with an unpredictable user and network messing stuff up all the time. Like much of web development it comes down to not going crazy. Otherwise we would be writing web applications in C++ or Java, wouldn't we?

[+] FZambia|12 years ago|reply
I don't see anything wrong with a separate asynchronous server which handles real-time for your Django site.

When a user generates an event on your site, you just handle it in the traditional manner: POST via AJAX, validate, save if necessary, and then publish to the asynchronous server, which broadcasts the event to all connected clients. This way you have a graceful fallback in case of async server downtime, so your user doesn't even notice something went wrong. You are not mixing things which were not developed to be mixed. You just write your site as usual and then add real-time elements where necessary.
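
The save-then-publish flow and its fallback can be sketched as a plain function. This is a minimal sketch under assumptions: `handle_post` and the `validate`, `save`, and `publish` callables are hypothetical placeholders for the Django view logic and the call to the async server.

```python
import logging

logger = logging.getLogger(__name__)


def handle_post(data, validate, save, publish):
    """Validate and save first; broadcast second (all callables are hypothetical).

    Returns a (saved_object, broadcast_ok) tuple.
    """
    validate(data)      # raises on bad input, so nothing is saved or published
    saved = save(data)  # the database write is the source of truth
    try:
        publish(saved)  # hand the event to the async server
    except Exception:
        # The async server is down: the request still succeeds, and other
        # clients simply see the change on their next regular page load.
        logger.warning("real-time publish failed; continuing", exc_info=True)
        return saved, False
    return saved, True
```

Because the publish step runs only after a successful save and its failure is swallowed, the site keeps working as an ordinary AJAX application whenever the real-time layer is unavailable.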

Using Gevent together with Django seems like monkey patching entire web site to me.

I really respect the work of the guys developing uWSGI. But at the moment it does not seem to be usable in a simple, obvious way. Maybe in the future their real-time support will become mature and convenient enough.

Of course, a Meteor- or Derby-like approach solves the problem at another level. But in the context of Django I don't think we should consider them as examples. We use Python, not JavaScript - we have no native solution for the browser environment, and I personally think we do not even need one.

[+] Choronzon|12 years ago|reply
The best way I found to do this for Python is by using Tornado: you have an excellent websocket implementation baked in and a scheduler within the webserver itself, so it's simple to poll for changes and update only when necessary, or interleave with a callback if you want "true" real time. Plug in a front end with angular/knockout etc., pass around JSON objects and you are good.
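
The "poll for changes and update only when necessary" part can be sketched without any framework. `ChangeWatcher`, `fetch`, and `notify` are hypothetical names: `fetch` stands for whatever reads the current state (a database query, say) and `notify` for whatever writes to the connected clients.

```python
class ChangeWatcher:
    """Calls `notify` only when the polled value differs from the last one seen.

    `fetch` and `notify` are hypothetical callables supplied by the caller.
    """

    def __init__(self, fetch, notify):
        self._fetch = fetch
        self._notify = notify
        self._last = None
        self._primed = False  # False until the first poll has run

    def poll(self):
        """Intended to be run on a timer; returns True if an update was sent."""
        current = self._fetch()
        if self._primed and current == self._last:
            return False  # nothing changed, so stay quiet
        self._primed = True
        self._last = current
        self._notify(current)
        return True
```

With Tornado, the timer could plausibly be a `tornado.ioloop.PeriodicCallback(watcher.poll, 1000)` started on the IO loop, with `notify` writing to the open websocket handlers; that wiring is an assumption and is not shown here.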

As far as meteor/node goes, having the same language on the server and the client is great. Having javascript as that language is not so great. Web apps are generally a front end to something bigger and I never want to do any serious data wrangling in javascript if I can avoid it.

[+] jkarneges|12 years ago|reply
You can use Pushpin in front of Django (or any web framework, whether event-driven or not) to implement realtime features.

http://blog.fanout.io/2013/04/09/an-http-reverse-proxy-for-r...

The thesis behind this architecture is that most realtime web applications can be reduced to request/response and publish/subscribe messaging patterns. Instead of looking at Django as a legacy framework, look at it as 50% of the solution (read: request/response). Pushpin provides the rest.

[+] leephillips|12 years ago|reply
This looks quite interesting; thanks for sharing.
[+] clubhi|12 years ago|reply
I disagree that server side templates are no longer needed. Templates are often reused for things like sending emails or exporting to PDF. Sure, you could use a JavaScript server side template to do this.
[+] glynjackson|12 years ago|reply
The only issue I see with Django is websockets. Apart from that I have been using Django to build 'real time' web apps for years (AJAX). Django does server side very well, AngularJS does client side well, mix in django-angular and I have most of what I need. For websockets: django-websocket-redis.
[+] d0m|12 years ago|reply
I had the same problems.. I love Django and I want to use it for my real-time application but I just couldn't find a way to make it work. I've chosen to use node/angular/firebase instead and I'm very happy with my choice so far.
[+] scardine|12 years ago|reply
I use Angular with Django-Rest-Framework and I love it.
[+] dobbsbob|12 years ago|reply
localbitcoins.com is using Django. Start a trade and messages are real time without needing to reload a page.
[+] mcantelon|12 years ago|reply
Probably ajax, which is more resource intensive/laggy than WebSocket-y and isn't bi-directional (you've got to poll with ajax if you want to push changes from the server).