top | item 45138088

Alexander-Barth | 5 months ago

Actually, in Julia you can write kernels in a subset of the Julia language:

https://cuda.juliagpu.org/stable/tutorials/introduction/#Wri...
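For reference, a minimal vector-add kernel in the style of that tutorial (a sketch, assuming CUDA.jl and a CUDA-capable GPU):

```julia
using CUDA

# Each GPU thread handles one element of the arrays.
function gpu_add!(y, x)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] += x[i]
    end
    return nothing
end

x = CUDA.fill(1.0f0, 1024)
y = CUDA.fill(2.0f0, 1024)

# Launch with enough blocks of 256 threads to cover all elements.
@cuda threads=256 blocks=cld(length(y), 256) gpu_add!(y, x)
```

The kernel body is plain Julia; CUDA.jl compiles it to PTX via the same compiler infrastructure as the rest of the language.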

With KernelAbstractions.jl you can target both CUDA and ROCm:

https://juliagpu.github.io/KernelAbstractions.jl/stable/kern...
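The same vector add written against KernelAbstractions.jl looks like this (a sketch; `CUDABackend()` and `ROCBackend()` come from CUDA.jl and AMDGPU.jl respectively):

```julia
using KernelAbstractions

# Backend-agnostic kernel: the same code runs on CPU, CUDA, or ROCm,
# depending on which backend object it is instantiated with.
@kernel function add!(y, x)
    i = @index(Global)
    @inbounds y[i] += x[i]
end

backend = CPU()  # swap in CUDABackend() or ROCBackend() for GPUs
x = ones(Float32, 1024)
y = ones(Float32, 1024)

# Instantiate for the backend (workgroup size 64), then launch.
add!(backend, 64)(y, x; ndrange = length(y))
KernelAbstractions.synchronize(backend)
```

Because the backend is just a value passed at instantiation time, the same kernel source can be dispatched to whichever vendor's hardware is available.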

For Python (or rather, a Python-like DSL), there is also Triton (and probably others):

https://pytorch.org/blog/triton-kernel-compilation-stages/


davidatbu | 5 months ago

Chris's claim (at least with regard to Triton) is that it delivers about 80% of the performance, and they're aiming for closer to 100%.