I'm still blown away that AMD hasn't made this their top priority; I've said it for years. If I were AMD, I would spend billions upon billions if necessary to build a CUDA compatibility layer. It would still pay off, and it almost certainly wouldn't cost that much.
The lack of CUDA support on AMD is absolutely not because AMD "couldn't" (although I certainly won't deny that their software has generally been lacking); it's clearly a strategic decision.
Many projects have turned out far better than their proprietary counterparts because open source doesn't have to please shareholders.
Hey, I'm actually working on making this compatible with earlier AMD GPUs as well, because I have an old gaming laptop with an RX 5700M, which is GFX10. I'm reading the ISA documentation to see where the differences are, and I'll have to adjust some of the binary encoding to get it to work.
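A minimal C sketch of what "adjusting some binary encoding" per GPU generation can look like. The opcode values and bit positions below are invented for illustration; the real layouts come from AMD's GFX10/GFX11 ISA reference guides, and getting a single bit wrong yields a different instruction.

```c
#include <assert.h>
#include <stdint.h>

enum gfx_target { GFX10, GFX11 };

/* Encode a hypothetical "v_add"-style instruction whose opcode field
 * sits at different bit positions on different generations. The
 * constants 0x32 and 0x03 are placeholders, not real AMD opcodes. */
static uint32_t encode_vadd(enum gfx_target t, uint32_t dst, uint32_t src)
{
    if (t == GFX11)
        return (0x32u << 25) | (src << 9) | dst;  /* opcode in bits [31:25] */
    return (0x03u << 26) | (src << 9) | dst;      /* opcode in bits [31:26] */
}
```

The point is that a compiler doing its own encoding carries one such table per target, and the differences are exactly what the ISA manuals (plus trial and error) reveal.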
Don't let anyone dissuade you; it's going to be annoying, but it can be done. When diffusion models were new and ROCm was still a mess, I was manually patching a lot to get a VII, a 1030, then a 1200 working well enough.
I'm parsing the subset of C++ features that CUDA actually uses, not the full C++ spec, as that would take a very long time. The compiler itself being written in C99 is just because that's how I write my C; it's a separate thing.
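As a sketch of what parsing only "the subset CUDA actually uses" means in practice: on top of an ordinary C/C++ tokenizer, the lexer mainly needs to recognize a handful of extra keywords. This is illustrative C99, not BarraCUDA's actual lexer; the token names are hypothetical.

```c
#include <assert.h>
#include <string.h>

typedef enum { TOK_IDENT, TOK_KW_GLOBAL, TOK_KW_DEVICE, TOK_KW_SHARED } tok_kind;

/* Classify an already-scanned word: CUDA's execution-space qualifiers
 * become dedicated keyword tokens, everything else stays an identifier. */
static tok_kind classify(const char *word)
{
    if (strcmp(word, "__global__") == 0) return TOK_KW_GLOBAL;
    if (strcmp(word, "__device__") == 0) return TOK_KW_DEVICE;
    if (strcmp(word, "__shared__") == 0) return TOK_KW_SHARED;
    return TOK_IDENT;
}
```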
Honestly, I'm not sure how good LLVM's support for AMD GFX11 machine code is. It's a pretty niche backend; even if it exists, it may not produce ideal output. And it's a huge dependency.
Real developers never depended on AI to write good-quality code; in fact, the amount of slop code flying left and right is due to LLMs.
Note that this targets GFX11, which is RDNA3. Great for consumer cards, but not at all for the enterprise (CDNA) level. In other words, not a "CUDA moat killer".
There are a lot of people in this thread who don't seem to have caught up with the fact that AMD has worked very hard on its CUDA translation layer, and for the most part it just works now; you can build CUDA projects on AMD just fine on modern hardware and software.
This is one of those projects that sounds impossible until you realize CUDA is basically C++ with some extensions and a runtime library.
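A sketch of why the language side is tractable: a CUDA kernel such as `__global__ void add(float *c, const float *a, const float *b) { int i = threadIdx.x; c[i] = a[i] + b[i]; }` is, semantically, a function run once per thread index, so a translator can lower the launch `add<<<1, n>>>(c, a, b)` into per-index calls (or into native GPU code). The C names below are illustrative, not any project's actual output.

```c
#include <assert.h>

/* Lowered kernel body: threadIdx.x becomes an explicit parameter. */
static void add_kernel(float *c, const float *a, const float *b, int i)
{
    c[i] = a[i] + b[i];
}

/* Stand-in for the <<<1, n>>> launch: run the body for each "thread". */
static void launch_add(float *c, const float *a, const float *b, int n)
{
    for (int i = 0; i < n; i++)
        add_kernel(c, a, b, i);
}
```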
Will this run on cards that don't have ROCm (or latest-ROCm) support? Because if not, it's only going to be a tiny subset of a tiny subset of cards that this lets CUDA run on.
This is likely supremely naive, but I'd think getting coverage for an entire library on a target's native assembly is largely a matter of mapping/translating functions, building acceptance tests, and benchmarking/optimizing, and all three of those feel like they should be greatly assisted by LLM-augmented workflows.
We run AI inference workloads on Nvidia GPUs, and the cost is a real pain point. Projects like this matter because GPU vendor lock-in directly affects what startups can afford to build. I'd love to see how this performs on common inference ops like conv2d and attention layers.
h4kunamata|1 month ago
>A will to live (optional but recommended)
>LLVM is NOT required. BarraCUDA does its own instruction encoding like an adult.
>Open an issue if theres anything you want to discuss. Or don't. I'm not your mum.
>Based in New Zealand
Oceania sense of humor is like no other haha
The project owner strongly emphasize the no LLM dependency, in a world of AI slope this is so refreshing.
The sheer amount of knowledge required to even start such a project is really something else, and proving the manual wrong at the machine-language level is something else entirely.
When it comes to AMD, "no CUDA support" is the biggest "excuse" to join NVIDIA's walled garden.
Godspeed to this project; the more competition there is, the less NVIDIA can keep wrecking PC part pricing.
querez|1 month ago
The project owner is talking about LLVM, a compiler toolkit, not an LLM.
wild_egg|1 month ago
Don't care though. AI can work wonders in skilled hands and I'm looking forward to using this project
magicalhippo|1 month ago
Reminded me of the beached whale animated shorts[1].
[1]: https://www.youtube.com/watch?v=ezJG0QrkCTA&list=PLeKsajfbDp...
samrus|1 month ago
> The project owner strongly emphasize the no LLM dependency, in a world of AI slope this is so refreshing.
"Has tech literacy deserted the tech insider websites of Silicon Valley? I will not believe it is so. ARE THERE NO TRUE ENGINEERS AMONG YOU?!"
renewiltord|1 month ago
The scientific term for this is “gradient descent”.
deeringc|1 month ago
https://github.com/Zaneham/BarraCUDA/blob/master/src/lexer.c...
freakynit|1 month ago
I would love to see these folks working together to break apart Nvidia's stranglehold on the GPU market (which, according to the internet, lets them run insane 70% profit margins, raising costs for all users worldwide).
piker|1 month ago
> make
Beautiful.
bri3d|1 month ago
Supporting CUDA on AMD would only build a bigger moat for NVidia; there's no reason to cede the entire GPU programming environment to a competitor. And indeed, this was a good gamble: as time goes on, CUDA has become less and less essential or relevant.
Also, if you want a practical path towards drop-in replacing CUDA, you want ZLUDA; this project is interesting and kind of cool but the limitation to a C subset and no replacement libraries (BLAS, DNN, etc.) makes it not particularly useful in comparison.
h4kunamata|1 month ago
What sucks is that such projects at some point become too big and make so much noise that big tech is forced to buy them, and everybody gets fuck all.
All it takes to beat a proprietary walled garden is somebody with the knowledge and the will to make things happen. Linus with Git and Linux is the perfect example.
Fun fact: BitKeeper said fuck you to the Linux community in 2005, and Linus created Git within 10 days.
BitKeeper made their code open source in 2016, but by then, nobody knew who they were lol
So give it time :)
guerrilla|1 month ago
More like wouldn't* most of the time.
Well isn't that the case with a few other things? FSR4 on older cards is one example right now. AMD still won't officially support it. I think they will though. Too much negativity around it. Half the posts on r/AMD are people complaining about it.
ZaneHam|1 month ago
I mean this with respect to the other person, but please don't vibe-code this if you want to contribute, or keep that compiler to yourself. This isn't because I'm against using AI assistance when it makes sense; it's because LLMs will really fail in this space. There are things in the specs you won't find until you try them, and LLMs find it really hard to get things right when literal bits matter.
monster_truck|1 month ago
It's a LOT less bad than it used to be; AMD deserves serious credit. Codex should be able to crush it once you get the env going.
h4kunamata|1 month ago
Open-source projects are being inundated with PRs from AIs; not depending on them doesn't limit a project.
The project owner seems pretty knowledgeable about what's going on, and keeping the project free of dependencies is not an easy skill. Many developers would have written the code with tons of dependencies and copy/paste from an LLM. Some call the latter coding :)
exabrial|1 month ago
But I digress; just took a quick look around... I don't know what I'm looking at, but it's impressive.
rbanffy|1 month ago
I guess CUDA got a lot more traction and there isn't much of a software base written for OpenCL. Kind of what happened with Unix and Windows - You could write code for Unix and it'd (compile and) run on 20 different OSs, or write it for Windows, and it'd run on one second-tier OS that managed to capture almost all of the desktop market.
I remember Apple did support OpenCL a long time ago, but I don't think they still do.
bravetraveler|1 month ago
Storage capacity everywhere rejoices
skipants|1 month ago
Shout out to https://github.com/vosen/ZLUDA which is also in this space and quite popular.
I got ZLUDA to generally work well enough with ComfyUI.
MATTEHWHOU|1 month ago
The hard part isn't the language translation — it's matching NVIDIA's highly optimized libraries (cuBLAS, cuDNN, etc.). If BarraCUDA can hit even 80% of the performance on common ML workloads, that's a game changer for anyone who bought AMD hardware.
Curious about the PTX translation layer specifically. That's where most previous attempts (like ZLUDA) hit a wall.
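To make the library-performance point above concrete: a straight translation gets you something like the naive kernel below, which is trivially correct but nowhere near cuBLAS, whose speed comes from tiling, shared-memory staging, and per-chip tuned schedules. Illustrative C only, assuming square row-major matrices.

```c
#include <assert.h>

/* Naive single-precision matrix multiply, C = A * B for n x n
 * row-major matrices. O(n^3) with no blocking, vectorization, or
 * memory-hierarchy awareness: the part source translation gives you
 * "for free", and the part tuned libraries leave far behind. */
static void sgemm_naive(int n, const float *A, const float *B, float *C)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int k = 0; k < n; k++)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = acc;
        }
}
```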
sam_goody|1 month ago
Seeing insane investments (in time, effort, knowledge, and frustration) like this makes me enjoy HN!!
(And there is always the hope that someone at AMD will see this and actually pay you to develop the thing.. Who knows)