top | item 34322929

Show HN: Socketify.py: Http/Https and WebSockets servers for PyPy3 and Python3

194 points | cirospaciari | 3 years ago | github.com

173 comments

[+] div72|3 years ago|reply
Probably should add that it's faster with PyPy, as it might not be obvious how an interpreted language like Python can beat an AOT-compiled language like Go.

Interesting results nevertheless.

[+] cirospaciari|3 years ago|reply
Yeah, I'm converting to HPy and it will probably be faster in CPython too, but like you said, AOT is hard to beat without at least a JIT. But even with CPython it is faster than Golang's Gin :D
[+] znpy|3 years ago|reply
I'm reading some arguments and counter-arguments in this thread, and in my opinion it kinda boils down to what point of view you take.

If you look at it as a framework that minimises the networking overhead, then fine, it's an interesting piece of software.

If on the other hand you look at it like a "fast" web framework then things start to change and the discussion gets a bit more complicated.

So for example, if you look at the source code of the applications being benchmarked (example: https://github.com/cirospaciari/socketify.py/blob/main/bench...), you immediately see it's simply returning the string "hello world". Which means it's running in the fast path / best case almost 100% of the time.

My guess is that as soon as you start doing any kind of computation in the request handler in normal, non-super-optimized Python (trivial example: validating some headers and/or checking some signatures, as you would do with JWT tokens), the Python-vs-Golang gap will start to swing back in Golang's favour.

And then again, it boils down to what you're doing: anything io-intensive might benefit from the unetworking/uwebsocket beneath, anything cpu-intensive will benefit from the golang compiler producing native executable code.

Nice work anyways.

[+] cirospaciari|3 years ago|reply
Actually the benchmark is on https://github.com/TechEmpower/FrameworkBenchmarks, which is basically this with more headers. But you are right, it's not enough; I used TechEmpower because it's very popular. I have some issues open (new features coming) to add better JWT token support, a database client and much more. I will post these in the future!
[+] kasey_junk|3 years ago|reply
This is a Python wrapper around libuv, not a pure Python solution (not that that's bad, but it explains the clickbait title).
[+] Ensorceled|3 years ago|reply
I don't really understand this criticism, nor why it's "clickbait". I have 12 "impure" Python packages in production. If my app ever needs WSGI, I'll have 13.
[+] cirospaciari|3 years ago|reply
uvicorn uses uvloop, which is a wrapper around libuv ;) Almost every performance-focused package uses native extensions (CFFI, Cython, HPy, Python C API, etc.)
[+] commitpizza|3 years ago|reply
Node.js also uses libuv, but it is way slower. And tbh, by that logic can't you ever compare Node to anything else, since it uses V8 and libuv?

I don't understand this logic.

[+] korijn|3 years ago|reply
Would a framework like uvicorn/fastapi be able to achieve similar performance if it were backed by libuv (as opposed to e.g. asyncio)?
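For context on the question: uvicorn can already run on uvloop, a libuv-backed drop-in event loop, and the swap is a one-line policy change. A sketch that falls back to the default loop when uvloop is not installed (whether this alone would match socketify is unclear, since socketify also replaces the HTTP layer with native code):

```python
import asyncio

try:
    import uvloop  # optional third-party, libuv-backed event loop
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    loop_impl = "uvloop"
except ImportError:
    loop_impl = "asyncio default"


async def handler() -> str:
    # stand-in for an ASGI app's request handling
    await asyncio.sleep(0)
    return "hello world"


result = asyncio.run(handler())
```

uvloop speeds up the event loop and transports, but the pure-Python protocol and framework layers above it still run interpreted, which is where asyncio-based stacks typically lose ground.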
[+] ki_|3 years ago|reply
Sure, it's faster... but Go's Gin and Fiber are really slow; they are literally 90% slower than the fast Go frameworks. Not that I'm a Go fanboy, but to put things into context here, the actual context is: a Python framework 5% faster than Fiber and 85% slower than gnet/silverlining/gearbox.
[+] ulimn|3 years ago|reply
How is Fiber slow? Can you please provide a source? According to the TechEmpower fortunes benchmark, it's the 3rd (prefork) and 6th (normal) fastest Go framework, and even in the all-language list it's 24th and 34th.
[+] szastamasta|3 years ago|reply
This is kind of cool, but it all makes sense if your app is a very simple CRUD that spends 99% of time just passing json around. I have this feeling that when you add some more logic there you’ll start noticing that you’re not in a compiled language any more. Your requests will get slower and your memory usage will explode.

Saying that Python is faster than Go with this as proof looks like overreaching to me. It only proves that wrapping C code in Python is fast. It's an achievement, sure, but your app probably won't be faster once you add thousands of Python lines between request and response.

[+] melenaboija|3 years ago|reply
It does not say that Python is faster than Go; it clearly says "Python framework".

> It only proves that wrapping C code in Python is fast.

Yes, it proves that for this specific use case Python is faster.

[+] FpUser|3 years ago|reply
I looked at the tests. There is nothing significant going on when creating response in http server for example. Just spit "hello world". So it appears to be a test of C++ uWebSockets and uSockets libraries vs for example native Go implementation. Do something serious in request handler using Python and then see what happens.
[+] cirospaciari|3 years ago|reply
Actually uWebSockets and uSockets will perform better than this (at least 2x in my local tests); I need to do a lot of copying, instantiating and crossing of the Python GIL.

Yeah, this test basically shows that Python backed by uWS is crazy fast, but it is not a direct comparison of uWS to Go.

This test is just a throughput test, which is very useful for measuring raw performance.

More tools, like caching tools, a better database client etc., are needed to construct a complete scenario, and I'm working on it! (Maybe it will be done in 1 or 2 weeks.)

https://github.com/TechEmpower/FrameworkBenchmarks

https://www.techempower.com/benchmarks/#section=test&runid=1...

[+] Thaxll|3 years ago|reply
When I see that: https://github.com/cirospaciari/socketify.py/blob/main/bench...

It's kind of hopeless, Python still needs to fork per core to get any performance? So if you have 8 cores you're actually running 8 processes, so 8 DB pool etc ...

[+] simonw|3 years ago|reply
Why does that feel hopeless to you?

Running one process per core has been working well for scaling huge websites for decades at this point.

[+] cirospaciari|3 years ago|reply
The Python GIL is the problem for multithreading, but I think I have a solution that avoids needing 8 DB pools; soon I will post about it. But yeah, it's a waste.
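The pre-fork pattern under discussion is roughly the following, as a minimal POSIX-only stdlib sketch (a real server would have each child accept on a shared or SO_REUSEPORT socket; the function name is made up):

```python
import os


def prefork(worker, num_workers):
    """Fork num_workers children; each runs worker() and then exits.

    Returns the child pids to the parent. Because each child is a full
    process, per-process state such as a DB connection pool is
    duplicated num_workers times -- the waste being described above.
    """
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            worker()        # child: would normally accept/serve forever
            os._exit(0)     # never return into the parent's loop
        pids.append(pid)    # parent keeps track of children
    return pids
```

One process per core sidesteps the GIL for CPU work, at the cost of multiplying any per-process resources.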
[+] jerf|3 years ago|reply
A good example of why web servers should be measured in seconds per request rather than requests per second. As you trim little tiny bits off the seconds per request, the requests per second go skyrocketing off to infinity, but unless you're doing zero work per request, that's a dubiously useful metric.

1,000,000 requests per second on what I think is an 8-core system (?) is 1/1,000,000 * 8 = 8us per request. 1,250,000 requests per second would be 6.4 us per request. Is this really your make or break issue? There's a set of people who can say "yes" to that. There's a set of people who think it is yes. The latter is much larger than the former.

(Although arguably the ones worst off are the ones for whom it is their make-or-break issue, but they don't realize it....)
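The conversion above is easy to sanity-check: the per-request time budget on an N-core box is cores divided by requests per second. A quick sketch:

```python
def us_per_request(requests_per_second: float, cores: int = 1) -> float:
    # Total CPU-time budget per request, in microseconds,
    # assuming the load is spread evenly across all cores.
    return cores / requests_per_second * 1_000_000


one_million = us_per_request(1_000_000, cores=8)   # 8.0 us per request
faster = us_per_request(1_250_000, cores=8)        # ~6.4 us per request
```

Seen this way, a 25% throughput "win" is a 1.6 microsecond saving per request, which frames the question of whether it matters for a given workload.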

[+] commitpizza|3 years ago|reply
Very cool, I will bookmark this since I am out on a look for a new backend framework for an api I am building.
[+] cirospaciari|3 years ago|reply
Thanks for all your support and feedback! If you want to request any new features, or have any questions related to the project, I will be glad to help you <3
[+] js4ever|3 years ago|reply
UWebsocket, the ultimate performance level!
[+] cyber1|3 years ago|reply
Yet another "X faster than Y" where inside X all the hot jobs are done by really well-tuned C or C++. Facepalm.

Without real problem-solving (business logic) written in Python, this is only a lightweight wrapper over C/C++. When the amount of Python code in the hot path starts growing, these blazing-amazing throughput numbers will go down tremendously.

[+] inglor|3 years ago|reply
I'm so happy this is (currently) the top comment and people are starting to realize measuring perf with these well tuned micro-benchmarks is a sham.
[+] commitpizza|3 years ago|reply
> Yet another "X faster than Y" where inside X all hot jobs are done by really good-tuned C or C++. Facepalm.

This is the case for basically every programming language or performant library that exists. Yet I never see this in discussions about Node libraries, or other languages for that matter. I mean, you could argue that Node itself is just a thin wrapper around fast C++ libraries.

Who the fuck cares if the request goes down to some compiled C++ library? I still write my logic in Python and get this benefit anyway. This is what makes it great.

[+] Waterluvian|3 years ago|reply
I think this is correct and does demonstrate that these comparisons are not all that useful.

But I also think that writing Python and then dropping down into C or C++ for performance is perfectly valid and often a great idea. Of course, there's nothing wrong with just using a different language that's a sensible middle ground between Python and C (like Go). But hold on to the baby when that bathwater is being dumped: there's nothing wrong with writing a blend of Python and C++ for real uses.

[+] carlsborg|3 years ago|reply
Would Golang interfaced with the C++ implementation be as fast? IMO one of the nice features of Python is being able to seamlessly drop down to Numba, or a C++ module, when you need an optimized fast path, while the rest of your application, the bulk of it, need not be.
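One stdlib way to "drop down" from Python without a compiler toolchain is ctypes. A sketch calling libm's `sqrt` directly (library lookup is platform-dependent, hence the fallback to the builtin, which is also C under the hood):

```python
import ctypes
import ctypes.util
import math

libm_path = ctypes.util.find_library("m")  # e.g. "libm.so.6" on Linux
if libm_path:
    libm = ctypes.CDLL(libm_path)
    # Declare the C signature so ctypes marshals doubles correctly
    libm.sqrt.restype = ctypes.c_double
    libm.sqrt.argtypes = [ctypes.c_double]
    fast_sqrt = libm.sqrt
else:
    fast_sqrt = math.sqrt  # fallback when libm can't be located

value = fast_sqrt(2.0)
```

Each such call still pays a per-call marshalling cost at the Python/C boundary, which is the same boundary tax the thread discusses for GIL crossings.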
[+] t8sr|3 years ago|reply
Others point out that calling it a Python framework might be disingenuous, but only if your goal is to compare programming languages.

If you're a web developer trying to pick a framework for websockets, you probably don't care that the Python framework is a libuv wrapper while the Go framework is native.

So well done, I guess? I am really happy to see library authors taking performance seriously!

EDIT: This is assuming the benchmark is actually fair. I haven't looked at it, but it's not uncommon for benchmarks to be comparing apples and oranges.

[+] cirospaciari|3 years ago|reply
It's a "Python framework" because it is a framework for Python; like most performance-focused frameworks for Python, it uses a lot of native code. uvicorn uses uvloop, which is a wrapper for libuv.

Web developers don't care whether the framework is native or a wrapper; they just want something that works in a nice way :D

The benchmarks are in https://github.com/TechEmpower/FrameworkBenchmarks

EDIT: some preliminary results https://www.techempower.com/benchmarks/#section=test&runid=1...

[+] Ensorceled|3 years ago|reply
> So well done, I guess? I am really happy to see library authors taking performance seriously!

> EDIT: This is assuming the benchmark is actually fair. I haven't looked at it, but it's not uncommon for benchmarks to be comparing apples and oranges.

What is the point of this negativity? Why are you implying that the benchmarks might not be fair because you "haven't looked at it"?

[+] avinassh|3 years ago|reply
> EDIT: This is assuming the benchmark is actually fair. I haven't looked at it, but it's not uncommon for benchmarks to be comparing apples and oranges.

well, a real-world test would be both servers parsing a 100 kB JSON payload in each message
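A rough feel for what that would add per request, using only the stdlib (the payload shape here is made up, and real messages and timings will differ):

```python
import json
import time

# Build a JSON document of roughly 100 kB
payload = json.dumps([
    {"id": i, "name": f"user-{i}", "active": i % 2 == 0}
    for i in range(2000)
])

# Measure one parse of the payload, as each message would pay
t0 = time.perf_counter()
parsed = json.loads(payload)
parse_seconds = time.perf_counter() - t0
```

Since CPython's `json` parser is itself C, this cost would hit both stacks, but it dwarfs the few-microsecond framework differences the hello-world benchmark measures.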