
A high-performance, zero-overhead, extensible Python compiler using LLVM

241 points | wspeirs | 1 year ago | github.com

93 comments


haberman|1 year ago

> Non-goals: Drop-in replacement for CPython: Codon is not a drop-in replacement for CPython. There are some aspects of Python that are not suitable for static compilation — we don't support these in Codon.

This is targeting a Python subset, not Python itself.

For example, something as simple as this will not compile, because lists cannot mix types in Codon (https://docs.exaloop.io/codon/language/collections#strong-ty...):

    l = [1, 's']

It's confusing to call this a "Python compiler" when the constraints it imposes pretty fundamentally change the nature of the language.

quotemstr|1 year ago

It's not even a subset. They break foundational contracts of the Python language without technical necessity. For example,

> Dictionaries: Codon's dictionary type does not preserve insertion order, unlike Python's as of 3.6.

That's a gratuitous break. Nothing about preserving insertion order interferes with compilation, AOT or otherwise. The authors of Codon broke dict ordering because they felt like it, not because they had to.
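For reference, insertion order has been a documented language guarantee since CPython 3.7 (and an implementation detail in 3.6), and ordinary code relies on it. A minimal illustration in plain CPython:

```python
# In CPython 3.7+, dicts preserve insertion order by language guarantee.
d = {}
d["zebra"] = 1
d["apple"] = 2
d["mango"] = 3

# Iteration follows insertion order, not sort order or hash order.
assert list(d) == ["zebra", "apple", "mango"]

# Common idioms depend on this, e.g. order-preserving de-duplication:
items = ["b", "a", "b", "c", "a"]
deduped = list(dict.fromkeys(items))
assert deduped == ["b", "a", "c"]
```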

At least Mojo merely claims to be Python-like. Unlike Codon, it doesn't claim to be Python then note in the fine print that it doesn't uphold Python contractual language semantics.

wpietri|1 year ago

Yeah, this right here would kill it for me:

> Strings: Codon currently uses ASCII strings unlike Python's unicode strings.

That rules out almost anything web-ish for me.
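To make the web-ish point concrete, here's the kind of everyday string handling that assumes Unicode in CPython (illustrative only):

```python
import json

# CPython strings are Unicode: len() counts code points, and encoding
# to bytes is an explicit, separate step.
s = "naïve café"
assert len(s) == 10                   # code points
assert len(s.encode("utf-8")) == 12   # UTF-8 bytes (ï and é take 2 each)

# Anything web-ish hits non-ASCII immediately (form data, JSON, URLs):
payload = json.dumps({"name": "Jürgen"}, ensure_ascii=False)
assert "Jürgen" in payload
```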

The use case I could imagine is places where you have a bunch of python programmers who don't really want to learn another language but you have modest amounts of very speed-sensitive work.

E.g., you're a financial trading company that has hired a lot of PhDs with data science experience. In that context, I could imagine saying, "Ok, quants, all of your production code has to work in Codon." It's not like they're programming masters anyhow, and having it be pretty Python-ish will be good enough for them.

bpshaver|1 year ago

Who is out here mixing types in a list anyway?
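It happens more often than you'd think, usually without anyone deciding to do it. Two ordinary cases (illustrative only):

```python
import json

# JSON arrays are heterogeneous by design, so parsing one
# gives you a mixed-type list.
row = json.loads('[1, "s", true, null]')
assert row == [1, "s", True, None]

# Building a command line mixes strings and numbers until you
# remember to str() everything:
port = 8080
cmd = ["server", "--port", port]  # [str, str, int] -- a mixed list
assert [type(x) for x in cmd] == [str, str, int]
```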

odo1242|1 year ago

Yeah, it feels closer to something like Cython without the Python part.

jjk7|1 year ago

The differences seem relatively minor. Your specific example can be worked around by using a tuple, which in most cases does what you want.
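In plain CPython terms, the tuple version of that example looks like this; each slot keeps its own type, which is what a static compiler can work with (illustrative; Codon's exact typing rules are in its docs):

```python
# A list forces a single element type under strong typing; a tuple
# gives each position its own static type (here: int, str).
t = (1, "s")

# Unpacking recovers the per-slot values; identical in CPython.
n, s = t
assert n == 1 and s == "s"

# The trade-off: tuples are fixed-length and immutable, so there is
# no append() as with a list.
```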

Lucasoato|1 year ago

> Is Codon free? Codon is and always will be free for non-production use. That means you can use Codon freely for personal, academic, or other non-commercial applications.

I hope it is released under a truly open-source license in the future; this seems like a promising technology. I'm also wondering how it would match C++ performance if it is still garbage collected.

troymc|1 year ago

The license is the "Business Source License 1.1" [1].

The Business Source License (BSL) 1.1 is a software license created by MariaDB Corporation. It's designed as a middle ground between fully open-source licenses and traditional proprietary software licenses. It's kind of neat because it's a parametric license, in that you can change some parameters while leaving the text of the license unchanged.

For Codon, the "Change Date" is 2028-03-01 and the "Change License" is "Apache License, Version 2.0", meaning that the license will change to Apache 2.0 in March of 2028. Until then, I guess you need to make a deal with Exaloop to use Codon in production.

[1] https://github.com/exaloop/codon?tab=License-1-ov-file#readm...

actionfromafar|1 year ago

I immediately wonder how it compares to Shedskin¹

I can say one thing - Shedskin compiles to C++, which was very compelling to me for integrating into existing C++ products. Actually another thing too, Shedskin is Open Source under GPLv3. (Like GCC.)

1: https://github.com/shedskin/shedskin/

crorella|1 year ago

It looks like Codon has fewer restrictions than Shedskin.

amelius|1 year ago

The challenge is not just to make Python faster, it's to make Python faster __and__ port the ecosystem of Python modules to your new environment.

eigenspace|1 year ago

It’s also just simply not python. It’s a separate language with a confusingly close syntax to python, but quite different semantics.

Myrmornis|1 year ago

This should be top comment. If I don't get the ecosystem then I'd just use Rust.

veber-alex|1 year ago

What's up with their benchmarks [1]? The page just shows benchmark names; I don't see any numbers or graphs. Tried Safari and Chrome.

[1]: https://exaloop.io/benchmarks/

sdmike1|1 year ago

The benchmark page looks to be broken; the JS console shows some 404'd JS libs and a bad function call.

pizlonator|1 year ago

Also those are some bullshit benchmarks.

It’s not surprising that you can make a static compiler that makes tiny little programs written in a dynamic language into fast executables.

The hard part is making that scale to >=10,000 LoC programs. I dunno which static reasoning approaches codon uses, but all the ones I’m familiar with fall apart when you try to scale to large code.

That’s why JS benchmarking focused on larger and larger programs over time. Even the small programs that JS JIT writers use tend to have a lot of subtle idioms that break static reasoning, to model what happens in larger programs.

If you want to get in the business of making dynamic languages fast then the best advice I can give you is don’t use any of the benchmarks that these folks cite for your perf tuning. If you really do have to start with small programs then something like Richards or deltablue are ok, but you’ll want to diversify to larger programs if you really want to keep it real.

(Source: I was a combatant in the JS perf wars for a decade as a webkitten.)

w10-1|1 year ago

Unclear if this has been in the works longer than the GraalVM LLVM build of Python discussed yesterday [1]. The first HN discussion is from 2022 [3].

Any relation? Any comparisons?

Funny I can't find the license for graalvm python in their docs [2]. That could be a differentiator.

- [1] GraalVM Python on HN https://news.ycombinator.com/item?id=41570708

- [2] GraalVM Python site https://www.graalvm.org/python/

- [3] HN Dec 2022 https://news.ycombinator.com/item?id=33908576

timwaagh|1 year ago

It's a really expensive piece of software; they don't publish their prices because of it. I don't think it's reasonable to market such a product to your average dev. Anyhow, Cython and a bunch of others provide free and open-source alternatives.

albertzeyer|1 year ago

There is also RPython (used by PyPy) (https://rpython.readthedocs.io/), which is a strict subset of Python, allowing for static analysis, specifically for the translation logic needed by PyPy. I was told that RPython is not really intended as a general-purpose language/compiler, but specifically for implementing something like PyPy.

But it may still be an interesting comparison to Codon.

jay-barronville|1 year ago

Instead of building their GPU support atop CUDA/NVIDIA [0], I’m wondering why they didn’t instead go with WebGPU [1] via something like wgpu [2]. Using wgpu, they could offer cross-platform compatibility across several graphics APIs, covering a wide range of hardware including NVIDIA GeForce and Quadro, AMD Radeon, Intel Iris and Arc, ARM Mali, and Apple’s integrated GPUs.

They note the following [0]:

> The GPU module is under active development. APIs and semantics might change between Codon releases.

The thing is, based on the current syntax and semantics I see, it’ll almost certainly need to change to support non-NVIDIA devices, so I think it might be a better idea to just go with WebGPU compute pipelines sooner rather than later.

Just my two pennies…

[0]: https://docs.exaloop.io/codon/advanced/gpu

[1]: https://www.w3.org/TR/webgpu

[2]: https://wgpu.rs

MadnessASAP|1 year ago

Well for better or worse CUDA is the GPU programming API. If you're doing high performance GPU workloads you're almost certainly doing it in CUDA.

While WebGPU states that compute is within its design, I would imagine it is focused on presentation/rendering and probably not on large, demanding workloads.

pjmlp|1 year ago

Because WebGPU is an API designed for browsers, targeting hardware designs from 2016.

big-chungus4|1 year ago

So, assuming I don't have integers bigger than int64 and don't rely on the order of built-in dicts, can I just take arbitrary Python code and use it with Codon? Can I use external libraries? NumPy, PyTorch? Also noticed that it isn't supported on Windows.
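The int64 caveat is easy to trip over, since CPython ints are arbitrary precision and never overflow. A small illustration of where the behaviors would diverge (the wraparound below just simulates what a fixed-width int64 would do):

```python
INT64_MAX = 2 ** 63 - 1  # 9223372036854775807

# Fine in CPython -- integers grow without bound:
big = 10 ** 30
assert big > INT64_MAX

# Simulating two's-complement int64 wraparound at the boundary:
wrapped = (INT64_MAX + 1) & 0xFFFFFFFFFFFFFFFF
if wrapped >= 2 ** 63:          # reinterpret as signed
    wrapped -= 2 ** 64
assert wrapped == -(2 ** 63)    # INT64_MAX + 1 wraps to INT64_MIN
```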

shikon7|1 year ago

From the documentation of the differences with Python:

> Strings: Codon currently uses ASCII strings unlike Python's unicode strings.

That seems really odd to me. Who would use a framework nowadays that doesn't support unicode?

tony-allan|1 year ago

I would love to see LLVM/WebAssembly as a supported and documented backend!

xiaodai|1 year ago

Please stop trying to make python fast. Move over to Julia already.

jitl|1 year ago

What’s the difference between this and Cython? I think another comment already asks about shedskin.

rich_sasha|1 year ago

Cython relies heavily on the Python runtime. You cannot, for example, make a standalone binary with it. A lot of unoptimized Cython output is just Python API calls wrapped in C.

From a quick glance this seems to genuinely translate into native execution.

mgaunard|1 year ago

aren't there like a dozen of those already?

Numba, Cython, PyPy...