austinvhuang | 1 year ago
It's still early days for pushing compute use cases to WebGPU (OctoML being super early notwithstanding). There's a small matmul in the examples directory, but it only has the most basic tiling optimizations. One of my goals for the next few weeks is porting the transformer block kernels from llm.c; I think that will flesh out the picture far better. If there's enough interest, I'm happy to collaborate and could potentially do a writeup.
There are always some tradeoffs that come with portability, but part of my goal with gpu.cpp is to create a scaffold to experiment and see how far we can push portable GPU performance.