item 47211298

davidw | 7 hours ago

It started off nicely but before long you get

"The MLP (multilayer perceptron) is a two-layer feed-forward network: project up to 64 dimensions, apply ReLU (zero out negatives), project back to 16"

Which starts to feel pretty owly indeed.

I think the whole thing could be expanded to cover some more of it in greater depth.
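For concreteness, the quoted sentence can be written out directly. This is a minimal NumPy sketch, not the article's code: the dimensions (16 and 64) come from the quote, while the variable names and the illustrative random initialization are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_hidden = 16, 64  # sizes from the quoted description
W1 = rng.standard_normal((d_model, d_hidden)) * 0.02  # project up to 64 dims
b1 = np.zeros(d_hidden)
W2 = rng.standard_normal((d_hidden, d_model)) * 0.02  # project back to 16
b2 = np.zeros(d_model)

def mlp(x):
    """Two-layer feed-forward block: up-project, ReLU, down-project."""
    h = np.maximum(0.0, x @ W1 + b1)  # ReLU zeros out negatives
    return h @ W2 + b2

x = rng.standard_normal(d_model)  # one token's 16-dim representation
y = mlp(x)
print(y.shape)  # (16,)
```

The whole "MLP" is two matrix multiplies with a nonlinearity in between, which is arguably easier to see in five lines of code than in a sentence of jargon.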

tibbar | 4 hours ago

I think the big frustration I've had in learning modern ML is that the entire owl is just so complicated. A poor explainer reads like "black box is black boxing the other black box", completely undecipherable. A mediocre-to-above-average explanation will be like "(loosely introduced concept) is (doing something that sounds meaningful) to black box", which is a little better. However, when explanations start getting more accurate, you run into the sheer volume of concepts/data transforms taking place in a transformer, and there's too much information to be useful as a pedagogical device.

growingswe | 4 hours ago

I tried to include tooltips in some places that go into more depth, but I understand there's a jump. I'm not sure what the best way to go about it is, tbh.

malnourish | 1 hour ago

I liked the tooltips. You should define each term the first time it shows up (MLP for example).