
Scaling node.js to 100k concurrent connections

64 points | dworrad | 13 years ago | blog.caustik.com

63 comments

ericz | 13 years ago
Before everyone gets excited about these big numbers, I would like to remind you that even higher concurrency can be achieved with even lower CPU and memory usage using Erlang. These numbers are good for Node, but don't use this as evidence that Node is magical and much better at handling large numbers of connections than other systems.
jimparkins | 13 years ago
People are excited by Node doing numbers like this because there is a massive, active Javascript community - hundreds of thousands of people using Javascript all day, every day, at work. Only 0.01% of those people would ever consider learning Erlang, and even if they did they would not be able to use it at work - ever. As with everything, having better features means nothing if nobody adopts them. I am not saying nobody uses Erlang, and I am not saying people are not adopting it - but the numbers are just not comparable to the Javascript community. Lastly, I realise that just because you know a bit of Javascript does not mean you can architect massive real-time systems. But it is like WoW: even people who play casually aspire to having the best kit or playing for a top guild.
scarmig | 13 years ago
Interesting example from 2008:

http://www.metabrew.com/article/a-million-user-comet-applica...

I'd also add that this shouldn't be taken as something to say that Erlang is totally superior to node or that Erlang makes scaling to 1M concurrent connections a piece of cake. If you're working at that level, there's no magical out of the box solution.

ebiester | 13 years ago
How fast can I get a median programmer to learn Erlang, learn the libraries, and be productive enough to be able to make these high concurrency apps?

Let's say they are a full stack programmer who knows some html, some css, some javascript, some java, and some sql.

I have a pretty good idea how fast I can bring someone up to speed on node.js -- I have to teach them some advanced JS concepts, some node.js conventions, and the APIs of my library. Async takes a little bit to wrap your head around, but it's not terrible.

Node.js seems like it is on the way to "worse is better."

ricardobeat | 13 years ago
The magic is in javascript. How many people can write Erlang?
pron | 13 years ago
... or Java.
est | 13 years ago
what about other languages?
forgotAgain | 13 years ago
Garbage collection is disabled. How is this then relevant to any real world usage?
sootzoo | 13 years ago
He's not running with GC permanently disabled; he's only disabled the automatic GC because of its huge overhead (he claims 1-second pauses every few seconds). He also mentions it's trivial to enable manual GC and run that via setInterval/setTimeout/what-have-you.
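For illustration, that pattern (automatic GC off, manual sweeps on a timer) might look roughly like this in node, where starting with --expose-gc makes global.gc() callable; the flag and the 30-second interval are assumptions, not the author's exact setup:

```javascript
// Run node with: node --expose-gc app.js
// --expose-gc exposes global.gc() so collections can be triggered
// on our own schedule instead of via long automatic pauses.

function scheduleManualGC(intervalMs) {
  if (typeof global.gc !== 'function') {
    // Not started with --expose-gc; nothing to schedule.
    return null;
  }
  // Sweep periodically, ideally during quiet periods.
  return setInterval(() => global.gc(), intervalMs);
}

const handle = scheduleManualGC(30000); // 30s is a guessed value
if (handle) handle.unref();             // don't keep the process alive for this
```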
gaius | 13 years ago
Isn't this really scaling the underlying C runtime to 100k connections?
babuskov | 13 years ago
I use Node in production. The main thing I like about it is that, watching the system usage graphs as the number of users grows, the only thing going UP is bandwidth ;)

I'd really like to see a story of someone really having 100k connected browsers. My online game currently peaks at about 1000 concurrent connections, and the node process rarely lasts longer than 2 hours before it crashes. Of course, using a DB like Redis to keep user sessions makes the problem almost invisible to users, as restart is instantaneous. I'm using socket.io, express, the crypto module, etc.

I'd really like to see real figures for node process uptime from someone having 5000+ concurrent connections.

giulianob | 13 years ago
I'm using C# for my game Tribal Hero (www.tribalhero.com). It's still in early beta so I've only had 450 concurrent users. Our CPU and memory usage barely moved from 0 to 450 users. We're using socket selects and not even async sockets, which would have even better performance. It's also backed by MySQL, though we want to eventually move to Redis. Why is Node breaking at 1k connections? That doesn't seem like much at all.
benologist | 13 years ago
I do about 300,000 - 500,000 concurrent connections on nodejs, but it's all short-lived web requests.

It took a while to iron out most of the cases that can crash; right now I have:

web.1: up for 12h
web.2: up for 12h
web.3: up for 12h
web.4: up for 12h
web.5: up for 12h
web.6: up for 4h
web.7: up for 1h
web.8: up for 12h
web.9: up for 12h
web.10: up for 12h
web.11: up for 32m
web.12: up for 12h
web.13: up for 7h
web.14: up for 7h

poundy | 13 years ago
I could never get my socket.io instance to max out. Is there a good way to load test socket.io and web sockets?
antihero | 13 years ago
Can uwsgi/nginx be configured similarly?

Is it common practice to have node face the web without nginx?

devmach | 13 years ago
It's a shame that he didn't mention kernel tuning. Without custom settings (like net.ipv4.tcp_mem), I think it's very difficult to reach these numbers.
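For reference, the usual knobs involved look something like this; the specific values below are illustrative guesses, not the author's settings:

```shell
# Raise per-process and system-wide file descriptor limits
ulimit -n 999999
sysctl -w fs.file-max=1000000

# Give the TCP stack more memory headroom (pages: min / pressure / max)
sysctl -w net.ipv4.tcp_mem="786432 1048576 1572864"

# More ephemeral ports and a deeper accept queue
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
sysctl -w net.core.somaxconn=4096
```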
nivertech | 13 years ago
I did 3M/node on physical servers, 800K/node on EC2 instances.

We mostly use Erlang on the server side and node.js + CoffeeScript on the client side (where they rightfully belong ;)

nicolast | 13 years ago
It struck me the author runs his apps as root (in screenshots). But then I remembered he's using node.js to handle "thousands of concurrent connections".
xentronium | 13 years ago
I think it's his testing machine, so it's a shoot-and-forget setup.
dotborg2 | 13 years ago
It looks like the author is not aware of some concurrency problems, deadlocks etc. The backend/database might not scale to 100k concurrent connections so easily.
benologist | 13 years ago
This is where NodeJS really starts to shine - persistent connections and background operations let you do a whole bunch of cool stuff to mitigate that.

In my case I have entire db tables and collections replicated in memory and kept in sync via redis pubsub, and the 100,000s of concurrent users I have are all sharing just a few dozen persistent redis and mongodb connections between them.

darkarmani | 13 years ago
Scaling the backend is a lot easier than dealing with concurrent front-end connections!
ericmoritz | 13 years ago
I would really love to know what he did to tune that Rackspace VM. I had a terrible time trying to get node.js and others to get past 5,000 concurrent websocket connections on a m1.large EC2 instance or on Rackspace.
mariuz | 13 years ago
I wonder what happens at 100k database connections; I will give it a try with Firebird and the nodejs driver.
bradleyland | 13 years ago
That's the thing about these types of benchmarks. They're useful for showing that node has the throughput -- at a low level -- to serve a huge number of concurrent connections, but that doesn't translate directly into huge application throughput if you're relying on things like database access over a network. In practice, each of these problems must be solved individually.

I don't mean to minimize this accomplishment. If you're assuming you need 100k database connections in order to scale, you might be solving the wrong problem. Scaling is a matter of moving data as close to the CPU as possible. This means in-memory caching is where real performance comes in. I don't care how good your language/framework is, you can't defeat the physics of slow I/O over a network.
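A toy version of "move the data next to the CPU" (class name and TTL handling are illustrative):

```javascript
// Minimal in-process TTL cache: answers hot reads from memory so the
// slow network round-trip to the database only happens on a miss.
class MemoryCache {
  constructor() {
    this.store = new Map();
  }
  set(key, value, ttlMs) {
    this.store.set(key, { value, expires: Date.now() + ttlMs });
  }
  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expires) {
      this.store.delete(key);        // lazy expiry on read
      return undefined;
    }
    return entry.value;
  }
}
```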

ricardobeat | 13 years ago
You would be using a connection pool instead of opening one for every client.
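The idea in miniature, as a hand-rolled pool for illustration (in practice you'd use a driver's built-in pooling, e.g. a MySQL client's createPool):

```javascript
// Tiny generic pool: a fixed set of connections shared by all
// request handlers, instead of one connection per connected user.
class Pool {
  constructor(factory, size) {
    this.idle = Array.from({ length: size }, factory);
    this.waiters = [];
  }
  acquire() {
    // Hand out an idle connection, or queue the caller until
    // someone releases one.
    return new Promise(resolve => {
      if (this.idle.length > 0) resolve(this.idle.pop());
      else this.waiters.push(resolve);
    });
  }
  release(conn) {
    const next = this.waiters.shift();
    if (next) next(conn);            // pass it straight to a waiter
    else this.idle.push(conn);
  }
}
```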
babuskov | 13 years ago
I have 500-600 node connections using a single DB connection and it works fine. It's MySQL using a binary driver, though.
bluesmoon | 13 years ago
I remember seeing this on HN back in April.