taliesinb | 2 years ago
I've recently finished an unorthodox kind of visualization / explanation of transformers. It's sadly not interactive, but it does have some arguably unique strengths.
First, it gives array axes semantic names, represented in the diagrams as colors (which this post also uses). The sequence axis is red, the key feature axis is green, the multihead axis is orange, etc. This lets you show quite complicated array circuits and get an immediate feeling for what is going on and how the different arrays are being combined with each other. Here's a pic of the full multihead self-attention step, for example:
https://math.tali.link/raster/052n01bav6yvz_1smxhkus2qrik_07...
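To make the named-axis idea concrete in code (this is my own minimal sketch, not the author's formalism — the einsum subscripts just stand in for the colored axes):

```python
import numpy as np

# Axis legend (matching the post's colors, roughly):
#   s, t = sequence axis ("red"), k = key feature axis ("green"),
#   h = multihead axis ("orange"), d = model feature axis, v = value feature axis.

def multihead_self_attention(x, Wq, Wk, Wv, Wo):
    # x: (s, d); Wq, Wk: (h, d, k); Wv: (h, d, v); Wo: (h, v, d)
    q = np.einsum('sd,hdk->hsk', x, Wq)        # queries, per head
    k = np.einsum('sd,hdk->hsk', x, Wk)        # keys, per head
    v = np.einsum('sd,hdv->hsv', x, Wv)        # values, per head
    scores = np.einsum('hsk,htk->hst', q, k) / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)              # softmax over key-side sequence axis t
    mixed = np.einsum('hst,htv->hsv', w, v)    # attention-weighted values
    return np.einsum('hsv,hvd->sd', mixed, Wo) # merge heads back to (s, d)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                    # 5 tokens, model dim 8
Wq = rng.normal(size=(2, 8, 4)); Wk = rng.normal(size=(2, 8, 4))
Wv = rng.normal(size=(2, 8, 4)); Wo = rng.normal(size=(2, 4, 8))
out = multihead_self_attention(x, Wq, Wk, Wv, Wo)
print(out.shape)  # (5, 8)
```

Even without the diagrams, keeping every axis named in the subscripts makes it obvious which axes are contracted and which survive at each step — the diagrams do the same thing visually with color.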
It also uses a kind of generalized tensor-network diagrammatic notation -- if anyone remembers Penrose's tensor notation, it's like that, but enriched with colors and some other ideas. Underneath, these diagrams are string diagrams in a particular category, though you don't need to know that (and I don't even explain it!).
Here's the main blog post introducing the formalism: https://math.tali.link/rainbow-array-algebra
Here's the section on perceptrons: https://math.tali.link/rainbow-array-algebra/#neural-network...
Here's the section on transformers: https://math.tali.link/rainbow-array-algebra/#transformers
jimmySixDOF | 2 years ago
https://pytorch.org/blog/inside-the-matrix/