Gimpei | 9 months ago

I’ve known some people who didn’t want to learn the syntax of numpy and did it all in loops, and the resulting code was harder to read, not easier. The fundamental issue is that operations on high dimensional arrays are very difficult to reason about. Numpy can probably be improved, but I don’t think loops are the answer.

dahart|9 months ago

The point here is not that it’s loops per se, the point is that the indexing is explicit. It seems like a big win to me. The article’s ~10 non-trivial examples all make the code easier to read, and more importantly, to understand exactly what the code is doing. It is true that some operations are difficult to reason about, that’s where explicit indexing really helps. The article resonates with me because I do want to learn numpy syntax, I’ve written hundreds of programs with numpy, spent countless hours doing battle with it, and I feel like I’m no better off now than someone who’s brand new to it. The indexing is constantly confounding, nothing ever just works. Anytime you see “None” and “axis=” inside an operation, it’s a tell: bound to be difficult to comprehend. I’m always having to guess how to use some combination of reshape, dstack, hstack, transpose, and five other shape changers I’m forgetting, just to get something to work and it’s difficult to read and understand later. It feels like there is no debugging, only rewriting. I keep reading the manual for einsum over again and I’ve used it, but I can’t explain how, why, or when to use it, it seems like this thing you have to resort to because no other indexing seems to work. The ability to do straightforward explicit non-clever indexing as if you were writing loops seems like a pretty big step forward.
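To make the “None” and “axis=” complaint concrete, here is a minimal sketch of one operation (a batched matrix-vector product) written three ways. The array names, shapes, and the choice of operation are made up for illustration and are not taken from the article; the point is only to show how the explicit-index version states what the broadcast version leaves implicit.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))  # hypothetical batch of 4 matrices, each 3x5
x = rng.standard_normal((4, 5))     # hypothetical batch of 4 vectors, each length 5

# 1) Explicit indexing: every index is written out, so the meaning is visible.
y_loop = np.empty((4, 3))
for b in range(4):
    for i in range(3):
        y_loop[b, i] = sum(A[b, i, j] * x[b, j] for j in range(5))

# 2) Broadcasting: None inserts a length-1 axis so x lines up with A,
#    and axis=2 says which axis (j) gets summed away.
y_bcast = (A * x[:, None, :]).sum(axis=2)

# 3) einsum: the index letters play the role of the loop variables above.
y_einsum = np.einsum("bij,bj->bi", A, x)

assert np.allclose(y_loop, y_bcast)
assert np.allclose(y_loop, y_einsum)
```

Read against the loop, the einsum string "bij,bj->bi" is the same statement: indices that appear on the left but not on the right (here j) are summed over.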

collingreen|9 months ago

I involuntarily whispered "reshape" to myself near the top of your comment. Numpy is a very different way for me to think and I have similar feelings to what you're describing.

cl3misch|9 months ago

I could never understand why people use dstack, hstack and the like. I think plain np.stack and specifying the axis explicitly is easier to write and to read.

For transposes, np.einsum can be easier to read as it lets you use (single character, admittedly) axis specifiers to "name" them.
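A rough sketch of both suggestions, using small made-up arrays (the names and shapes are assumptions for illustration only). One caveat worth seeing in code: np.hstack concatenates along an existing axis rather than adding a new one, so its explicit-axis counterpart is np.concatenate, not np.stack.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

# np.stack always creates a NEW axis, at the position you name.
s_first = np.stack([a, b], axis=0)   # shape (2, 2, 3)
s_last = np.stack([a, b], axis=-1)   # shape (2, 3, 2)

# For 2-D inputs, dstack gives the same result as stacking on the last axis.
d = np.dstack([a, b])                # shape (2, 3, 2)

# hstack joins along an existing axis; concatenate with axis=1 says so explicitly.
h = np.hstack([a, b])                # shape (2, 6)
c = np.concatenate([a, b], axis=1)   # shape (2, 6)

# Transpose via einsum: the letters act as (single-character) axis names.
t = np.arange(24).reshape(2, 3, 4)   # think of the axes as (b, h, w)
t_perm = np.transpose(t, (2, 0, 1))  # shape (4, 2, 3); you must decode the tuple
t_named = np.einsum("bhw->wbh", t)   # same result, with the axes spelled out

assert np.array_equal(d, s_last)
assert np.array_equal(h, c)
assert np.array_equal(t_perm, t_named)
```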

breppp|9 months ago

I've read the article and it didn't seem to me the author is suggesting loops

okigan|9 months ago

What’s a better syntax then?

tikhonj|9 months ago

The real question—to which I have absolutely no answer—is not about syntax, it's about concepts: what is a better way to think about higher-dimensional arrays rather than loops and indices? I'm convinced that something better exists and, if it existed, encoding it in a sufficiently expressive (ie probably not-Python) language would give us the corresponding syntax, but trying to come up with a better syntax without a better conceptual model won't get us very far.

Then again, maybe even that is wrong! "Notation as a tool for thought" and all that. Maybe "dimension-munging" in APL really is the best way to do these things, once you really understand it.

CamperBob2|9 months ago

English. "Write me a Python function or program that does X, Y, and Z on U and V using W." That will be the inevitable outcome of current trends, where relatively-primitive AI tools are used to write slightly more sophisticated code than would otherwise be written, which in turn is used to create slightly less-primitive AI tools.

For example, I just cut-and-pasted the author's own cri de coeur into Claude: https://claude.ai/share/1d750315-bffa-434b-a7e8-fb4d739ac89a Presumably at least one of the vectorized versions it replied with will work, although none is identical to the author's version.

When this cycle ends, high-level programs and functions will be as incomprehensible to most mainstream developers as assembly is today. Today's specs are tomorrow's programs.

Not a bad thing, really. And overdue, as the article makes all too clear. But the transition will be a dizzying one, with plenty of collateral disruption along the way.

willseth|9 months ago

why_not_both.gif