
FragmentShader|1 year ago

> Your benchmark doesn't match the experience of people building games and applications on top of WebGPU

Here's an example of Bevy WebGL vs Bevy WebGPU:

I get 50 fps on 78k birds with WebGPU: https://bevyengine.org/examples-webgpu/stress-tests/bevymark...

I get 50 fps on 90k birds with WebGL: https://bevyengine.org/examples/stress-tests/bevymark/

So you can test the difference between them with technically the same code.

(They can get 78k birds, which is way better than my triangles, because they batch 'em. I know 10k draw calls doesn't seem good, but any 2024 computer can handle that load with ease.)
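A minimal sketch of what that batching looks like at the wgpu level (set_vertex_buffer/draw are wgpu's real API, roughly the 0.19-era signatures; the buffer names and counts are illustrative, not Bevy's actual renderer):

    // Instead of recording 10_000 draw commands, put per-bird data in an
    // instance buffer and cover all birds with one instanced draw.
    fn draw_birds_batched<'a>(
        pass: &mut wgpu::RenderPass<'a>,
        quad_vb: &'a wgpu::Buffer,     // 6 vertices for one quad
        instance_vb: &'a wgpu::Buffer, // 10k per-bird transforms/colors
    ) {
        pass.set_vertex_buffer(0, quad_vb.slice(..));
        pass.set_vertex_buffer(1, instance_vb.slice(..));
        // One command crosses to the GPU instead of 10_000.
        pass.draw(0..6, 0..10_000);
    }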

Older frameworks will get 10x better results, such as Kha (https://lemon07r.github.io/kha-html5-bunnymark/) or OpenFL (https://lemon07r.github.io/openfl-bunnymark/), but they run at a lower resolution and this is a very CPU-based benchmark, so I'm not gonna count them.

> be limited by the fill rate of your GPU

They're 10k triangles and they're not overlapping... There are no textures per se. No passes except the main one, with a 1080p render texture. No microtriangles. And I bet the shader is less than 0.25 ALU.
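(For reference, a shader in that spirit, written as the WGSL string a wgpu pipeline would consume; this is a generic flat-color example, not the poster's actual shader:)

    // Flat-color fragment shader: no texture fetches, no lighting,
    // just a constant write per fragment, i.e. almost zero ALU work.
    const FLAT_COLOR_FS: &str = r#"
    @fragment
    fn fs_main() -> @location(0) vec4<f32> {
        return vec4<f32>(1.0, 0.5, 0.2, 1.0);
    }
    "#;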

> at which point you should see roughly the same performance across all APIs.

Nah, ANGLE (OpenGL) does just fine. Unity as well.

> a lower GPU usage could actually suggest that you're bottlenecked by the CPU

No. I have yet to see a game on my computer that uses more than 0.5% of my CPU. Games are usually GPU bound.


grovesNL|1 year ago

> Here's an example of Bevy WebGL vs Bevy WebGPU

I think a better comparison would be more representative of a real game scene, because modern graphics APIs are meant to optimize typical rendering loops and might even add more overhead to trivial test cases like bunnymark.

That said, they're already comparable, which seems great considering how little performance optimization WebGPU has received relative to WebGL (at the browser level). There are also some performance optimizations at the wasm binding level that haven't made it into Bevy yet and might be noticeable in trivial benchmarks, e.g., https://github.com/rustwasm/wasm-bindgen/issues/3468 (this applies much more to WebGPU than WebGL).

> They're 10k triangles and they're not overlapping... There are no textures per se. No passes except the main one, with a 1080p render texture. No microtriangles. And I bet the shader is less than 0.25 ALU.

I don't know your exact test case, so I can't say for sure, but if there are writes happening per draw call or something, then you might have problems like this. Either way, your graphics driver should be receiving roughly the same commands as it would when you use Vulkan or DX12 natively or WebGL, so there might be something else going on if the performance is a lot worse than you'd expect.
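To illustrate the kind of per-draw writes meant here (a hedged sketch; wgpu::Queue::write_buffer is the real API, while the function names and layout are made up):

    // Anti-pattern: one tiny upload per draw. In a browser, each write also
    // crosses into the GPU process, multiplying the overhead.
    fn upload_per_draw(queue: &wgpu::Queue, buf: &wgpu::Buffer, data: &[u8], stride: usize) {
        for (i, chunk) in data.chunks(stride).enumerate() {
            queue.write_buffer(buf, (i * stride) as u64, chunk);
        }
    }

    // Better: stage the whole frame on the CPU, upload once, then record draws.
    fn upload_batched(queue: &wgpu::Queue, buf: &wgpu::Buffer, data: &[u8]) {
        queue.write_buffer(buf, 0, data);
    }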

There is some extra API call (draw, upload, pipeline switch, etc.) overhead because your browser executes graphics commands in a separate rendering process, so this might have a noticeable performance effect for large draw call counts. Batching would help a lot with that whether you're using WebGL or WebGPU.

FragmentShader|1 year ago

> I think a better comparison would be more representative of a real game scene, because modern graphics APIs are meant to optimize typical rendering loops and might even add more overhead to trivial test cases like bunnymark.

I know, but that's the only instance where I could find the same project compiled for both WebGL and WebGPU.

> Either way, your graphics driver should be receiving roughly the same commands as it would when you use Vulkan or DX12 natively or WebGL, so there might be something else going on

Yep, I know. I benchmarked my program with Nsight, and the calls are indeed native, as you'd expect. I forced the DirectX 12 backend because the Vulkan and OpenGL ones are WAYYYY worse; they struggle even with 1,000 triangles.
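(For anyone who wants to reproduce that comparison, forcing a backend in wgpu looks roughly like this; the Backends flags are wgpu's real API, and the rest of the setup is elided:)

    fn main() {
        // Force DirectX 12; swap in Backends::VULKAN or Backends::GL to
        // compare the other native backends (wgpu ~0.19-style API).
        let instance = wgpu::Instance::new(wgpu::InstanceDescriptor {
            backends: wgpu::Backends::DX12,
            ..Default::default()
        });
        // ... request_adapter / request_device / render loop as usual ...
        let _ = instance;
    }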

> That said though, they're already comparable which seems great considering how little performance optimization WebGPU has received relative to WebGL (at the browser level).

I agree. But the whole internet is marketing WebGPU as the faster thing right now, not in the future once it's optimized. The same happened with Vulkan but in reality it's a shitshow on mobile. :(

> There is some extra API call (draw, upload, pipeline switch, etc.) overhead because your browser executes graphics commands in a separate rendering process, so this might have a noticeable performance effect for large draw call counts. Batching would help a lot with that whether you're using WebGL or WebGPU.

Aha. That's kinda my point, though. It's "slow" because it has more overhead; therefore, by default, I get less performance with more usage than I would with WebGL. Except this overhead seems to be in native WebGPU as well, not only in browsers. That's why I consider it way slower than, say, ANGLE, or a full game engine.

So, the problem after all is that by using WebGPU, I'm forced to optimize it to a point where I get less quality, more complexity and more GPU usage than if I were to use something else, due to the overhead itself. And chances are that the overhead is caused by the API itself being slow for some reason. In the future, that may change. But at the moment I ain't using it.

kaibee|1 year ago

> I have yet to see a game on my computer that uses more than 0.5% of my CPU.

Just a nitpick here: you probably have a multicore CPU, while the render-dispatch code is gonna be single-threaded. So the 0.5% you're seeing is the percentage of total CPU usage, but what you actually want is the % usage of a single core.
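(To put numbers on it: on a hypothetical 16-core/32-thread CPU, 0.5% of total capacity works out to 0.5% × 32 = 16% of one core, so a fully pegged render thread can still look tiny in a whole-machine CPU graph.)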

stanleykm|1 year ago

This looks to be CPU bound. I'm not getting full GPU utilization, but I am seeing the JavaScript thread using 100% of its time trying to render frames.

The WebGPU and WebGL APIs are pretty different, so I'm not sure you can call it "technically the same code".

caspar|1 year ago

> looks to be CPU bound.

Apparently "Bevy's rendering stack is often CPU-bound"[0], so that would make sense.

To be fair, that quote is somewhat out of context, but it was an easy official source to quote, and I've heard the same claim repeated elsewhere too. (I'm not a Bevy user but am using Rust for full-time indie game dev, so discount this comment appropriately.)

[0]: https://bevyengine.org/news/bevy-0-14/#gpu-frustum-culling

FragmentShader|1 year ago

> The WebGPU and WebGL APIs are pretty different, so I'm not sure you can call it "technically the same code".

Isn't Bevy using WGPU under the hood, and then just compiling it to both WebGL and WebGPU? That should be the same code Bevy-wise, and any overhead or difference should be caused by either the WGPU "compiler" or the browser's WebGPU.
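(For reference, backend selection at the wgpu level looks roughly like the sketch below; the Backends flags are wgpu's real API, though Bevy's exact wiring, e.g. its webgl2 cargo feature, may differ:)

    // On wasm, the same wgpu code can target either web API:
    // BROWSER_WEBGPU maps to navigator.gpu, GL maps to a WebGL2 context.
    fn web_instance(use_webgpu: bool) -> wgpu::Instance {
        let backends = if use_webgpu {
            wgpu::Backends::BROWSER_WEBGPU
        } else {
            wgpu::Backends::GL
        };
        wgpu::Instance::new(wgpu::InstanceDescriptor {
            backends,
            ..Default::default()
        })
    }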