top | item 47081005

(no title)

mr_octopus | 11 days ago

Great writeup. The three-pass softmax is where everyone gets stuck — subtracting the max for numerical stability is one of those things you can't learn from a textbook.

The pain points you hit (byte alignment, dispatch dims, strict typing) make me wonder if there's a sweet spot between raw WGSL and "import pytorch" that keeps you close to the metal without all the papercuts.

discuss

order

No comments yet.