top | item 40183379

(no title)

While I applaud the OP's exposition, imagine that instead of having axis names live as single-letter variables within einsum, our arrays themselves had these names attached to their axes?

It's almost like when we moved from writing machine code to structured programming: instead of being content to document that certain registers corresponding to certain semantically meaningful quantities at particular times during program execution, we manipulated only named variables and left compilers to figure out how to allocate these to registers? We're at that point now with array programming.

https://nlp.seas.harvard.edu/NamedTensor

https://math.tali.link/rainbow-array-algebra/

https://arxiv.org/abs/2102.13196

discuss

WCSTombs|1 year ago

> imagine that instead of having axis names live as single-letter variables within einsum, our arrays themselves had these names attached to their axes?

I already replied to a couple of other comments with this, but you're exactly describing the Python library Xarray: https://docs.xarray.dev/en/stable/ I've found it to work quite well in practice, and I've gotten a lot of mileage out of it in my work.

taliesinb|1 year ago

Thanks for highlighting XArray in your other comments. Yup, XArray is great. As are Dex and the various libraries for named axis DL programming within PyTorch and Jax. I never said these things don't exist -- I even mention them in the linked blog series!

But I do think it's fair to say they are in their infancy, and there is a missing theoretical framework to explain what is going on.

I anticipate name-free array programming will eventually be considered a historical curiosity for most purposes, and everyone will wonder how we put up without it for so long.

albertzeyer|1 year ago

PyTorch also has some support for them, but it's quite incomplete and has many issues so that it is basically unusable. And its future development is also unclear. https://github.com/pytorch/pytorch/issues/60832

ctrw|1 year ago

You mean like

    res2 = dumbsum(query, key, 'batch seq_q d_model, batch seq_k d_model -> batch seq_q seq_k')