I have used RollingCurl (non-blocking cURL) to fetch multiple API requests at once using PHP. It is really easy to implement using a simple class. The example shows how you could build a simple, efficient scraper.
eknkc|12 years ago
Node benchmark is flawed though. Add something like
require('http').globalAgent.maxSockets = 64;
at the top of the Node script if you want a fair comparison with the async PHP version. The bottleneck is bandwidth here, not the runtime.
On my laptop, the original script from the author took 35 seconds to complete.
With maxSockets = 64, it took 10 seconds.
Edit: And who is downvoting this? I just provided actual numbers and a way to reproduce them. If you don't like how the universe works, don't take it out on me.
STRML|12 years ago
Nobody should be downvoting you, you raise an excellent point and you are of course right.
philstu|12 years ago
NodeJS v0.10.21 + Cheerio
real 0m47.986s
user 0m7.252s
sys 0m1.080s

NodeJS v0.10.21 + Cheerio + 64 connections
real 0m14.475s
user 0m8.853s
sys 0m1.696s

PHP 5.5.5 + ReactPHP + phpQuery
real 0m15.989s
user 0m11.125s
sys 0m1.668s
Considerably quicker! As I said, I was sure NodeJS could go faster, but the point of the article was that PHP itself is not just magically 4 times slower; it is in fact almost identical when you use almost identical approaches. :)
philstu|12 years ago
dalore|12 years ago
> Update: A few people have mentioned that Node by default will use maxConnections of 5, but setting it higher would make NodeJS run much quicker. As I said, I'm sure NodeJS could go faster - I would never make assumptions about something I don't know much about - and the numbers reflect those suggestions. Removing the blocking PHP approach (because obviously it's slow as shit) and running just the other three scripts looks like this:
CJefferson|12 years ago
Also, that seems like a bit of a magic flag to add and tune. Why is that not the default, and would I have to keep tuning it for each of my apps?
filipedeschamps|12 years ago
Is this something safe to raise?
dubcanada|12 years ago
If we really want to get into benchmarks, LuaJIT with multithreading is almost 2x faster than both; it took 21 seconds to complete on my computer. And I'm willing to bet that multithreaded C would be even faster.
However, you want to know what this benchmark proves? Absolutely nothing, as it has to query a website. So the response time of the website matters more than this test.
gopalv|12 years ago
Long term PHP guy (I maintained APC for years, slowly given up now), so I've worked a lot with ~2k/3k request-per-second PHP websites.
The real trick here is async processing. A lot of the slow bits of PHP code are people not writing async data patterns.
If you use synchronous calls in PHP - mc::get or mysql or curl calls - then PHP absolutely sucks in performance.
Nodejs automatically trains you around this with its massive use of callbacks for everything. That is the canonical way to do things - while in PHP, blocking single-threaded calls are what everyone uses.
The most satisfying way to actually get PHP to perform well is to use async PHP with a Future result implementation. Being able to do a get() on a future result was the only sane way to mix async data flows with PHP.
For instance, I had a curl implementation which fetched multiple http requests in parallel and essentially let the UI wait for each webservices call at the html block where it was needed.
https://github.com/zynga/zperfmon/blob/master/server/web_ui/...
There was a similar Memcache async implementation, particularly for the cache writebacks (memcache NOREPLY). Memcache multi-get calls to batch together key fetches and so on.
The real issue is that this is engineering work on top of the language instead of being built into the "one true way".
So often, I would have to dig in and rewrite massive chunks of PHP code to hide latencies and get near the absolute packet limits of the machines - getting closer to the ~3500 to 4000 requests per-second on a 16 core machine (sigh, all of that might be dead & bit-rotting now).
Osiris|12 years ago
I get sick of these language wars, especially the constant stream of PHP ridicule that just never seems to end. The positive I try to take away from all of it is that there are a lot of people who are extremely passionate about software development and are striving for better tools and ways to express themselves. I want to believe that through the vitriol encountered in some of these articles there are people really trying to improve the technologies at heart, instead of taking part in some kind of programming language apologetics. In regards to PHP, I think that the ridicule has led to improvements in the language, but the overall tone in some of these articles is still a turn-off for me.
senorcastro|12 years ago
tegeek|12 years ago
1. Takes 1 minute to install on any platform (*nix, windows etc.)
2. A modern Package Manager (NPM) works seamlessly with all platforms.
3. All libraries started from 0 with async baked in from day 0.
4. No need to use any 3rd party JSON serialize/deserialize libs.
5. And above all, it's Atwood's law:
"any application that can be written in JavaScript, will eventually be written in JavaScript".
http://www.codinghorror.com/blog/2009/08/all-programming-is-...
Wilya|12 years ago
1. apt-get install php5 ? Seriously, that's it. On the other hand, neither Debian stable nor Ubuntu LTS have any usable version of node in their package repository (Debian has nothing, Ubuntu has 0.6)
4. json_decode() ?
5. If Atwood's law ever becomes reality, it will be a consequence, not a source of benefit.
(I don't use either Node or PHP as my main language)
debaserab2|12 years ago
1. You have to be kidding, right? PHP's popularity is precisely because of this.
2. getcomposer.org
4. json_decode/json_encode have been a part of PHP since PHP 5.2 (2006)
5. That's not a benefit.
RossM|12 years ago
1. My recent installs of Node have required compiling from source to get anything remotely up-to-date; however, there are packages for both.
2. Composer with the Packagist registry is comparable here - you might be thinking of PEAR.
3. JS certainly has much better async support - it being JavaScript after all.
4. PHP has JSON encoding/decoding bundled, no third party lib required.
5. For better or worse
octo_t|12 years ago
I'd trade decent JSON support for decent XML support every single day of the week.
And Scala/Java/JVM have already solved the problems you mention above.
jbeja|12 years ago
balac|12 years ago
ohwp|12 years ago
For example:
vs

Most of the time benchmarks prove how capable a programmer is, not the speed of the language used.
humanrebar|12 years ago
Any decent compile-time optimizer will transform your first snippet into the second one (or better). Some languages preclude that optimization at compile time, but I presume that a JIT would also have little problem performing that optimization.
That is, one could argue that a good language is one that lets developers ignore trivial changes like this without hurting performance.
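ohwp's two example snippets were lost in extraction. The pair below is a hypothetical reconstruction of the kind of trivial change being discussed (the classic loop-invariant example, here in JavaScript; the function names are made up):

```javascript
// Snippet 1: the loop bound is recomputed on every iteration
// (the PHP analogue is calling strlen($s) in the loop condition).
function sumSlow(arr) {
  var total = 0;
  for (var i = 0; i < arr.length; i++) {
    total += arr[i];
  }
  return total;
}

// Snippet 2: the invariant hoisted out by hand; a decent optimizer
// performs this transformation automatically.
function sumFast(arr) {
  var total = 0;
  for (var i = 0, n = arr.length; i < n; i++) {
    total += arr[i];
  }
  return total;
}

console.log(sumSlow([1, 2, 3]), sumFast([1, 2, 3])); // → 6 6
```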
disdev|12 years ago
At some point, you'd expect these arbitrary this vs. that comparisons to die off. They haven't, and I'm guessing they won't.
Basically, it comes down to picking the tool that best supports your use case, or being okay with a compromise. Like the SQL/NoSQL discussions recently... Use it poorly and you get poor results.
idProQuo|12 years ago
CmonDev|12 years ago
geerlingguy|12 years ago
But the reason for this wasn't that Node/JS is faster than PHP; it was because I was able to write the Node.js app asynchronously, but the PHP version was making hundreds of synchronous requests (this is the gist of the OP).
The issue I have is that Node.js makes asynchronous http calls relatively easy, whereas in PHP, using curl_multi_exec is kludgy, and few libraries support asynchronous requests.
The situation is changing, but the fact remains that asynchronous code is the norm in Node.js, while blocking code is the norm in PHP. This makes it more difficult (as of this writing) to do any non-trivial asynchronous work in PHP.
lukeholder|12 years ago
I agree that the comparisons between languages/frameworks are often unfair, and agree with everything phil says, but there is a lot to be said for language-level non-blocking constructs.
I am really enjoying reading Go code and seeing how people use concurrency, and they are all doing it the same way. When I would read Ruby, I would have to know the particulars of a library like Celluloid or EventMachine, which made it harder.
joeblau|12 years ago
The "Thoughts" section was the most informative part of the benchmark, and it underscores the way I operated when I was working with PHP. When I started with PHP (2005), the frameworks were terrible; I would cobble together many random coding examples from stuff I found on the web and just make up my own framework. I don't think PHP is any better or worse from a performance standpoint, but the default examples that you generally see in the ecosystem give significantly worse performance. The one thing Node clearly has the upper hand on PHP with is the ecosystem. It's a lot easier for a developer new to the Node ecosystem to hit that Node target than it would be for someone of the same skill to hit the PHP target, in terms of hours spent.
One funny thing is that the ReactPHP[1] site is visually similar to the Node[2] homepage.
[1] - http://reactphp.org/
[2] - http://nodejs.org/
girvo|12 years ago
wooptoo|12 years ago
erikig|12 years ago
dude3|12 years ago
alextingle|12 years ago
ausjke|12 years ago
philstu|12 years ago
A lot of the components are in production already; it was built by the original developers to be used in production. It's on 0.3.0 for many parts, which is no further behind than where Node was when people started flapping about it :)
dubcanada|12 years ago
hugofirth|12 years ago
onion2k|12 years ago
alexyoung|12 years ago
jlebrech|12 years ago
denysonique|12 years ago
Scraping using jQuery syntax is more familiar to most web developers than the PHP syntax. Even if Node was 5x slower than PHP I would still go for Node because of its easy jQuery syntax.
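denysonique's original snippet was lost in extraction. Below is a hypothetical illustration of the jQuery-style call shape being referred to (in Node this is what cheerio provides; a toy, tag-name-only `$()` is hand-rolled here so the sketch carries no dependencies):

```javascript
// Toy jQuery-like loader: supports bare tag-name selectors only,
// just enough to show the $(...).text() call shape.
function load(html) {
  return function $(selector) {
    var re = new RegExp('<' + selector + '[^>]*>([^<]*)</' + selector + '>', 'g');
    var texts = [];
    var m;
    while ((m = re.exec(html)) !== null) {
      texts.push(m[1]);
    }
    return {
      length: texts.length,
      // Like jQuery's .text(): concatenates the text of every match.
      text: function () { return texts.join(''); }
    };
  };
}

var $ = load('<ul><li>PHP</li><li>Node</li></ul>');
console.log($('li').length);  // → 2
console.log($('li').text());  // → PHPNode
```

With cheerio the `load`/`$` pair works the same way, but with full CSS selector support.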
jqueryin|12 years ago
* cheerio (https://github.com/MatthewMueller/cheerio)
* PhpQuery (https://code.google.com/p/phpquery/wiki/jQueryPortingState)
Both of these use a jQuery-esque syntax, so your comment regarding DOM traversal in PHP is a moot point.
wldlyinaccurate|12 years ago
> Even if Node was 5x slower than PHP I would still go for Node because of its easy jQuery syntax
That "jQuery syntax" has nothing to do with the language itself. jQuery uses Sizzle[0], which is a CSS selector library for JavaScript. There are plenty of PHP libraries which provide CSS selectors, such as the Symfony CssSelector component[1].
The argument you really should be making is that the JavaScript syntax is familiar. jQuery and its methods for traversing the DOM can trivially be implemented in any language, e.g. in PHP.
[0] https://github.com/jquery/sizzle
[1] https://github.com/symfony/CssSelector
deanclatworthy|12 years ago
http://symfony.com/doc/current/components/dom_crawler.html#n...