The main problem with Vulkan isn't the programming model or the lack of features; those are being tackled by Khronos. The problem is coverage and update distribution. It's all over the place! If you develop general-purpose software (like Zed), you can't assume that even basic things like dynamic rendering are supported uniformly. There are always weird systems with old drivers (looking at you, Ubuntu 22 LTS), hardware vendors abandoning and forcefully deprecating working hardware, and of course driver bugs...
So, by the time I'm going to be able to rely on the new shiny descriptor heap/buffer features, I'll have more gray hair and other things on the horizon.
This is why I try to steer new Linux users away from Ubuntu: it lags behind on functionality that is often important. It is now an enterprise OS (where durability matters more than functionality), and it's not really suitable for a power user (like someone who would use Zed).
> There are always weird systems with old drivers (looking at Ubuntu 22 LTS)
While I agree with your general point, RHEL stands out way, way more to me. Ubuntu 22.04 and RHEL 9 were both released in 2022. Where Ubuntu 22.04 has general support until mid-2027 and security support until mid-2032, RHEL 9 has "production" support through mid-2032 and extended support until mid-2034.
Wikipedia sources for Ubuntu [0] and RHEL [1]:
[0] https://en.wikipedia.org/wiki/Ubuntu#Releases
[1] https://upload.wikimedia.org/wikipedia/en/timeline/fcppf7prx...
Yes, this is the problem. They tout this new latest and greatest extension that fixes and simplifies a lot, yet you go look up the extension on vulkan.gpuinfo.org and see ... currently 0.3% of all devices support it. Which means you can't in any way use it. So you wait 5 years, and now maybe 20% of devices support it. Then you wait another 5 years, and maybe 75% of devices support it. And maybe you can get away with limiting your code to running on 75% of devices. Or, you wait another 5 years to get into the 90s.
Some just ignore it and require recent Vulkan (see for example DXVK). Do that. Ubuntu LTS isn't something you should be using for graphics-dependent desktop scenarios anyway. Limiting features based on it is a bad idea.
I wish they would just allow us to push everything to the GPU as buffer pointers, like the buffer_device_address extension allows, and then reconstruct the data into your required format via shaders.
GPU programming seems to be both super low level but also high level, because textures and descriptors need these ultra-specific data formats, and the way you construct and upload those formats is very complicated and changes all the time.
Is there really no way to simplify this?
Regular vertex data was supposed to be strictly pre-formatted in the pipeline too, until suddenly it wasn't, and now we can just give the shader a `device_address` extension memory pointer and construct the data from that.
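Shader-side BDA code can't be run here, but the "reconstruct from raw memory" idea described above can be sketched on the CPU side. Below is a hypothetical packed layout (12-byte position plus 4-byte RGBA color) decoded manually from a byte buffer, the way a shader would walk a raw device-address pointer; the layout and the `read_vertex` name are made up for illustration:

```rust
use std::convert::TryInto;

/// Decode one vertex from a packed byte buffer, the way a shader
/// would reconstruct it from a raw device address.
/// Hypothetical layout: 3 x f32 little-endian position + 4 x u8 RGBA.
fn read_vertex(buf: &[u8], index: usize) -> ([f32; 3], [u8; 4]) {
    const STRIDE: usize = 16;
    let base = index * STRIDE;
    let mut pos = [0f32; 3];
    for (i, p) in pos.iter_mut().enumerate() {
        let off = base + i * 4;
        *p = f32::from_le_bytes(buf[off..off + 4].try_into().unwrap());
    }
    let color: [u8; 4] = buf[base + 12..base + 16].try_into().unwrap();
    (pos, color)
}

fn main() {
    // Build a buffer holding two packed vertices.
    let mut buf = Vec::new();
    for &v in &[1.0f32, 2.0, 3.0] {
        buf.extend_from_slice(&v.to_le_bytes());
    }
    buf.extend_from_slice(&[255, 0, 0, 255]);
    for &v in &[4.0f32, 5.0, 6.0] {
        buf.extend_from_slice(&v.to_le_bytes());
    }
    buf.extend_from_slice(&[0, 255, 0, 255]);

    let (pos, color) = read_vertex(&buf, 1);
    assert_eq!(pos, [4.0, 5.0, 6.0]);
    assert_eq!(color, [0, 255, 0, 255]);
}
```

The point of the sketch: once the data is just bytes behind a pointer, the consumer decides the format, rather than a pipeline object baked ahead of time.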
I also want what you're describing. It seems like the ideal "data-in-out" pipeline for purely compute based shaders.
I've brought it up several times when talking with folks who work down at the chip level optimizing these operations, and all I can say is: there are a lot of unforeseen complications to what we're suggesting.
It's not that we can't have a GPU that does these things; apparently it's more a combination of past and current architectural decisions that work against it. For instance, an nVidia GPU is focused on providing the hardware optimizations necessary for either LLM compute or graphics acceleration, both essentially proprietary technologies.
The proprietariness isn't why it's obtuse though; you can make a chip go super-duper fast for specific tasks, or more general-purpose for all kinds of tasks. Somewhere, folks are making a tradeoff between backwards compatibility and supporting new hardware-accelerated tasks.
Neither of these is a "general purpose compute and data flow" focus. As a result, you get a GPU that's only sorta configurable for what you want to do. Which in my opinion explains your "GPU programming seems to be both super low level, but also high level" comment.
That's been my experience. I still think what you're suggesting is a great idea and would make GPUs a more open compute platform for a wider variety of tasks, while also simplifying things a lot.
Even on modern hardware there's still a lot of architectural differences to reconcile at the API level.
If you got what you're asking for you'd presumably lose access to any fixed function hardware. RE your example, knowing the data format permits automagic hardware-accelerated translations between image formats.
You're free to do what you're after by simply performing all operations manually in a compute shader. You can manually clip, transform, rasterize, and even sample textures. But you'll lose the implicit use of various fixed-function hardware that you currently benefit from.
I’m not watching Rust as closely as I once did, but it seems like buffer ownership is something it should be leaning on more fully.
There’s an old concurrency pattern where a producer and consumer tag team on two sets of buffers to speed up throughput. Producer fills a buffer, transfers ownership to the consumer, and is given the previous buffer in return.
It is structurally similar to double buffered video, but for any sort of data.
It seems like Rust would be good for proving the soundness. And it should be a library now rather than a roll your own.
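A minimal sketch of that two-buffer hand-off in Rust, using std channels as the ownership transfer: the producer owns a buffer only while filling it, sends it to the consumer, and reclaims the previously returned one. The function name, buffer size, and round count are made up for illustration:

```rust
use std::sync::mpsc;
use std::thread;

/// Producer and consumer tag-team on two buffers: the producer fills one,
/// sends ownership to the consumer, and reclaims the previously returned
/// buffer. Only two allocations exist for the whole run.
fn run_double_buffered(rounds: u8) -> u64 {
    let (filled_tx, filled_rx) = mpsc::channel::<Vec<u8>>();
    let (free_tx, free_rx) = mpsc::channel::<Vec<u8>>();

    // Seed the free list with two reusable buffers.
    free_tx.send(vec![0u8; 4]).unwrap();
    free_tx.send(vec![0u8; 4]).unwrap();

    let consumer = thread::spawn(move || {
        let mut total = 0u64;
        for _ in 0..rounds {
            let buf = filled_rx.recv().unwrap(); // take ownership
            total += buf.iter().map(|&b| b as u64).sum::<u64>();
            free_tx.send(buf).unwrap(); // hand the buffer back
        }
        total
    });

    for i in 0..rounds {
        let mut buf = free_rx.recv().unwrap(); // reclaim a free buffer
        buf.fill(i); // "produce" new data in place
        filled_tx.send(buf).unwrap(); // transfer ownership to the consumer
    }
    consumer.join().unwrap()
}

fn main() {
    // 10 rounds of 4-byte buffers filled with 0..=9 sum to 4 * 45 = 180.
    assert_eq!(run_double_buffered(10), 180);
}
```

Because `Vec<u8>` is moved through the channel, the borrow checker statically guarantees that neither side touches a buffer the other currently owns, which is exactly the soundness argument above.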
Just yesterday I watched this video: https://m.youtube.com/watch?v=7bSzp-QildA I am not a graphics programmer, but from what I understood I think he talks about doing what you are describing with Vulkan.
At least they are making an effort to correct the extension spaghetti, already worse than OpenGL.
Additionally, most of these fixes aren't coming to Android, which is now getting WebGPU for Java/Kotlin [0] after so many refused to move away from OpenGL ES, and naturally not to any card not lucky enough to get new driver releases.
Still, better now than never.
[0] - https://developer.android.com/jetpack/androidx/releases/webg...
As someone from game development: not supporting Vulkan on Android and sticking with OpenGL ES instead is a safer bet. There are always some devices that bug out badly on Vulkan. Nobody wants to sit and find workarounds for some obscure vendor.
Bizarre take. Notice that this WebGPU is an AndroidX library? That means WebGPU API support is built into apps via that library and runs on top of the system's Vulkan or OpenGL ES API.
Do you work for Google or an Android OEM? If not, you have no basis to make the claim that Android will cease updating Vulkan API support.
I'm really enjoying these changes. Going from render passes to dynamic rendering really simplified my code. I wonder how this new feature compares to existing bindless rendering.
From the linked video, "Feature parity with OpenCL" is the thing I'm most looking forward to.
You can use descriptor heaps with existing bindless shaders if you configure the optional "root signature".
However, it looks like it's simpler to change your shaders (if you can) to use the new GLSL/SPIR-V functionality (or Slang) and not specify the root signature at all (it's complex and verbose).
Descriptor heaps really reduce the amount of setup code needed; with pipeline layouts gone you can drop like a third of the code needed to get started. Similar in magnitude to dynamic rendering.
I suspect we are only 5-10 years away from Vulkan finally being usable. There are so many completely needlessly complex things, or things that should have an easy path for the common case.
BDA, dynamic rendering and shader objects almost make Vulkan bearable. What's still sorely missing is a single-line device malloc, a default queue that can be used without ever touching the queue family API, and an entirely descriptor-free code path. The latter would involve making the NV bindless extension the standard which simply gives you handles to textures, without making you manage descriptor buffers/sets/heaps. Maybe also put an easy-path for synchronization on that list and making the explicit API optional.
Until then I'll keep enjoying OpenGL 4.6, which has had BDA with C-style pointer syntax in GLSL shaders since 2010 (NV_shader_buffer_load), and which allows hassle-free buffer allocation and descriptor-set-free bindless textures.
Vulkan is already everywhere, from games to AI:
- with DXVK to play games
- with llama.cpp to run local LLMs
So this goes into Vulkan. Then it has to ship with the OS. Then it has to go into intermediate layers such as WGPU. Which will probably have to support both old and new mode. Then it has to go into renderers. Which will probably have to support both old and new mode. Maybe at the top of the renderer you can't tell if you're in old or new mode, but it will probably leak through. In that case game engines have to know about this. Which will cause churn in game code.
And Apple will do something different, in Metal.
Microsoft, Sony and Nintendo as well.
Unreal Engine and Unity have the staff to handle this, but few others do.
The Vulkan-based renderers which use Vulkan concurrency to get performance OpenGL can't deliver are few. Probably only Unreal Engine and Unity really exploit Vulkan properly.
Here's the top level of the Vulkan changes.[1] It doesn't look simple.
(I'm mostly grumbling because the difficulty and churn in Vulkan/WGPU has resulted in three abandoned renderers in Rust land through developer burnout. I'm a user of renderers, and would like them to Just Work.)
[1] https://docs.vulkan.org/refpages/latest/refpages/source/VK_E...
it's not.
descriptor sets are realistically never getting deprecated. old code doesn't have to be rewritten if it works. there's no point.
if you're doing bindless (which you most certainly aren't if you're still stuck with descriptor sets) this offers a better way of handling that.
if you care to upgrade your descriptor set based path to use heaps, this extension offers a very nice pathway to doing so _without having to even recompile shaders_.
for new/future code, this is a solid improvement.
if you're happy where you are with your renderer, there isn't a need to do anything.
I would like to / am "supposed to" use Vulkan, but it's a massive pain coming from OpenCL, with all kinds of issues that need careful handling which simply don't come up in OpenCL workloads.
Everyone keeps telling me OpenCL is deprecated (which is true, although it's also true that it continues to work superbly in 2026) but there isn't a good / official OpenCL to Vulkan wrapper out there to justify it for what I do.
Once Vulkan is finally in good order, descriptor_heap and others, I really really hope we can get a WebGPU.next.
Where are we at with the "what's next for webgpu" post, from 5 quarters ago? https://developer.chrome.com/blog/next-for-webgpu https://news.ycombinator.com/item?id=42209272
This is my point of view as someone who learned WebGPU as a precursor to learning Vulkan, and who is definitely not a graphics programming expert:
My personal experience with WebGPU wasn't the best. One of my dislikes was pipelines, which is something that other people also discuss in this comment thread. Pipeline state objects are awkward to use without an extension like dynamic rendering. You get a combinatorial explosion of pipelines and usually end up storing them in a hash map.
In my opinion, pipeline state objects are a leaky abstraction that exposes the way that GPUs work: namely, that some state changes may require some GPUs to recompile the shader, so all of that state should be bundled together. In my opinion, an API for the web should be concerned with abstractions from the point of view of the programmer designing the application: which state logically acts as a single unit, and which state may change frequently?
It seems that many modern APIs have gone with the pipeline abstraction; for example, SDL_GPU also has pipelines. I'm still not sure what the "best practices" are supposed to be for modern graphics programming regarding how to structure your program around pipelines.
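The "hash map of pipelines" workaround mentioned above can be sketched like this. The key fields and the `PipelineCache` API are hypothetical, standing in for real render state; the point is that every distinct state combination forces a separate cached pipeline:

```rust
use std::collections::HashMap;

// Hypothetical render-state key; the fields are made up for illustration.
#[derive(Clone, PartialEq, Eq, Hash)]
struct PipelineKey {
    shader_id: u32,
    color_format: u32,
    depth_test: bool,
    blend_enabled: bool,
}

struct Pipeline {
    id: u64, // a real driver handle would live here
}

struct PipelineCache {
    pipelines: HashMap<PipelineKey, Pipeline>,
    next_id: u64,
}

impl PipelineCache {
    fn new() -> Self {
        Self { pipelines: HashMap::new(), next_id: 0 }
    }

    // "Compile" on first use of a state combination, reuse afterwards.
    fn get_or_create(&mut self, key: PipelineKey) -> &Pipeline {
        let next_id = &mut self.next_id;
        self.pipelines.entry(key).or_insert_with(|| {
            let p = Pipeline { id: *next_id }; // expensive compile goes here
            *next_id += 1;
            p
        })
    }
}

fn main() {
    let mut cache = PipelineCache::new();
    let key = PipelineKey { shader_id: 0, color_format: 44, depth_test: true, blend_enabled: false };
    cache.get_or_create(key.clone());
    cache.get_or_create(key); // cache hit, nothing recompiled
    assert_eq!(cache.pipelines.len(), 1);
}
```

Toggling any single field (say, `blend_enabled`) yields a new key and therefore a new pipeline, which is the combinatorial explosion being complained about.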
I also wish that WebGPU had push constants, so that I do not have to use a bind group for certain data such as transformation matrices.
Because WebGPU is design-by-committee and must support the lowest common denominator hardware, I'm worried whether it will evolve too slowly to reflect whatever the best practices are in "modern" Vulkan. I hope that WebGPU could be a cross-platform API similar to Vulkan, but less verbose. However, it seems to me that by using WebGPU instead of Vulkan, you currently lose out on a lot of features. Since I'm still a beginner, I could have misconceptions that I hope other people will correct.
WebGPU is kinda meh, a 2010s graphics programmer's vision of a modern API. It follows Vulkan 1.0, and while Vulkan is finally getting rid of most of the mess like pipelines, WebGPU went all in. It's surprisingly cumbersome to bind stuff to shaders, and everything is static and has to be hashed & cached, which sucks for streaming/LOD systems. Nowadays you can easily pass arbitrary amounts of buffers and entire scene descriptions via GPU memory pointers to OpenGL, Vulkan, CUDA, etc. with BDA, and change them dynamically each frame. But not in WebGPU, which does not support BDA and is unlikely to support it anytime soon.
It's also disappointing that OpenGL 4.6, released in 2017, is a decade ahead of WebGPU.
As always, the only two positive things about WebGL and WebGPU, are being available on browsers, and having been designed for managed languages.
They lag behind modern hardware, and after almost 15 years, there are zero developer tools to debug from browser vendors, other than the oldie SpectorJS that hardly counts.
I think in the end it all depends on Android. Average Vulkan driver quality on Android doesn't seem to be great in the first place, and getting up-to-date Vulkan API support, at high enough quality and performance for a modernized WebGPU version to build on, might be too much to ask of the Android ecosystem for the next one or two decades.
I try my best to push ML things into WebGPU and I think it has a future, but performance is not there yet. I have little experience with Vulkan except toy projects, but WebGPU and Vulkan seem very similar.
WebGPU is kinda meh. It's for when you need to do something in the browser that you can't with WebGL. GLES is the compatibility king and runs pretty much everywhere, if not natively then through a compatibility layer like ANGLE. I'm sad that WebGPU killed WebGL 3, which was supposed to add compute shaders. Maybe WebGPU would've been more interesting if it wasn't made to replace WebGL but instead were a non-compatibility API targeting modern rendering and actually supporting SPIR-V.
Yes, you can get very close to that API with this extension + existing Vulkan extensions. The main difference is that you still kind of need opaque buffer and texture objects instead of raw pointers, but you can get GPU pointers for them and still work with those. In theory I think you could do the malloc API design there but it's fairly unintuitive in Vulkan and you'd still need VkBuffers internally even if you didn't expose them in a wrapper layer.
I've got a (not yet ready for public) wrapper on Vulkan that mostly matches this blog post, and so far it's been a really lovely way to do graphics programming.
The main thing that's not possible at all on top of Vulkan is his signals API, which I would enjoy seeing - it could be done if timeline semaphores could be waited on/signalled inside a command buffer, rather than just on submission boundaries. Not sure how feasible that is with existing hardware though.
It's a baby-step in this direction, e.g. from Seb's article:
> Vulkan’s VK_EXT_descriptor_buffer (https://www.khronos.org/blog/vk-ext-descriptor-buffer) extension (2022) is similar to my proposal, allowing direct CPU and GPU write. It is supported by most vendors, but unfortunately is not part of the Vulkan 1.4 core spec.
The new `VK_EXT_descriptor_heap` extension described in the Khronos post is a replacement for `VK_EXT_descriptor_buffer` which fixes some problems but otherwise is the same basic idea (e.g. "descriptors are just memory").
I personally just switched to using push descriptors everywhere. On desktops, the real-world limits are high enough that it ends up working out fine, and you get a nice immediate-mode API like OpenGL.
I'm sure the comments will be all excuses and whys, but they're all nonsense. It's just a poorly thought out API.
My understanding of API standards that need to be implemented by multiple vendors is that there's a tradeoff between having something that's easy for the programmer to use and something that's easy for vendors to implement.
A big complaint I hear about OpenGL is that it has inconsistent behavior across drivers, which you could argue is because of the amount of driver code that needs to be written to support its high-level nature. A lower-level API can require less driver code to implement, effectively moving all of that complexity into the open source libraries that eventually get written to wrap it. As a graphics programmer you can then just vendor one of those libraries and win better cross-platform support for free.
For example: I've never used Vulkan personally, but I still benefit from it in my OpenGL programs thanks to ANGLE.
Agreed. It has way too much completely unnecessary verbosity. Like, why the hell does it take 30 lines to allocate memory rather than one single malloc.
Uuugh, graphics. So many smart people expending great energy to look busy while doing nothing particularly profound.
Graphics people, here is what you need to do.
1) Figure out a machine abstraction.
2) Figure out an abstraction for how these machines communicate with each other and the cpu on a shared memory bus.
3) Write a binary spec for code for this abstract machine.
4) Compilers target this abstract machine.
5) Programs submit code to driver for AoT compilation, and cache results.
6) Driver has some linker and dynamic module loading/unloading capability.
7) Signal the driver to start that code.
AMD64, ARM, and RISC-V are all basically differing binary specs for a C-machine+MMU+MMIO compute abstraction.
Figure out your machine abstraction and let us normies write code that's accelerated without having to throw the baby out with the bathwater every few years.
Oh yes, give us timing information so we can adapt workload as necessary to achieve soft real-time scheduling on hardware with differing performance.
Wow, you should get NVIDIA, AMD and Intel on the phone ASAP! Really strange that they didn't come up with such a simple and straightforward idea in the last 3 decades ;)
I don’t know which of my detractors to respond to, so I’ll respond here.
It should be clear that I’m only interested in compute and not a GPU expert.
GPUs, from my understanding, have lost the majority of fixed-function units as they’ve become more programmable. Furthermore, GPUs clearly have a hidden scheduler and this is not fully exposed by vendors. In other words we have no control over what is being run on a GPU at any given instant, we simply queue work for it.
Given all these contrivances, why shouldn't the interface exposed to the user be absolutely simple? It should then be up to vendors to produce hardware (and co-designed compilers) to run our software as fast as possible.
Graphics developers need to develop a narrow-waist abstraction for wide, latency-hiding, SIMD compute. On top of this Vulkan, or OpenGL, or ML inference, or whatever can be done. The memory space should also be fully unified.
This is what needs to be worked on. If you don’t agree, that’s fine, but don’t pretend that you’re not protecting entrenched interests from the likes of Microsoft, Nvidia, Epic Games, Valve and others.
Telling people to just use Unreal Engine, or Unity, or even Godot, is just like telling people to just use Python, or TypeScript, or Go to get their sequential compute done.
Expose the compute!
surprise, it's very difficult to do across many hw vendors and classes of devices. it's not a coincidence that metal is much easier to program for.
maybe consider joining khronos since you apparently know exactly how to achieve this very simple goal...