
Python should take a lesson from APL: Walrus operator not needed

102 points| robomartin | 6 years ago | reply

During a recent HN discussion about the walrus operator I came to realize yet another advantage of notation. I used APL professionally for about ten years, which made it an obvious source of inspiration for an example that, in my opinion, demonstrates why the Python team missed a very valuable opportunity to take this wonderful language and start exploring the judicious introduction of notation as a valuable tool for thought (borrowing from Ken Iverson's APL paper with that title [0]).

To simplify, I'll define the desire for this walrus operator ":=" as "wanting to be able to make assignments within syntax where it was previously impossible":

    if x = 5    # was impossible

    # and now

    if x := 5  # makes it possible
A more elaborate example given in the PEP goes like this:

    Current:

    reductor = dispatch_table.get(cls)
    if reductor:
        rv = reductor(x)
    else:
        reductor = getattr(x, "__reduce_ex__", None)
        if reductor:
            rv = reductor(4)
        else:
            reductor = getattr(x, "__reduce__", None)
            if reductor:
                rv = reductor()
            else:
                raise Error(
                    "un(deep)copyable object of type %s" % cls)

    Improved:


    if reductor := dispatch_table.get(cls):
        rv = reductor(x)
    elif reductor := getattr(x, "__reduce_ex__", None):
        rv = reductor(4)
    elif reductor := getattr(x, "__reduce__", None):
        rv = reductor()
    else:
        raise Error("un(deep)copyable object of type %s" % cls)
 
At first I thought, well, just extend "=" and be done with it. The HN thread resulted in many comments against this idea. The one that made me think was this one [1]:

    These two are syntactically equal and in Python there's no 
    way a linter can distinguish between these two:

        if reductor = dispatch_table.get(cls):
        if reductor == dispatch_table.get(cls):
    
    A human being can only distinguish them through careful inspection. 
    The walrus operator not only prevents that problem, but makes 
    the intent unambiguous.
Which is a perfectly valid point. I get it.
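
One way to check the parser side of that argument is to ask Python itself how it reads each condition. This is only a sketch, assuming Python 3.8 or later (where ":=" exists); `classify` is a made-up helper name, not anything from the PEP or the quoted comment.

```python
import ast

def classify(condition: str) -> str:
    """Return the AST node type of an `if` condition, or 'SyntaxError'."""
    try:
        tree = ast.parse(f"if {condition}:\n    pass\n")
    except SyntaxError:
        return "SyntaxError"
    # tree.body[0] is the `if` statement; .test is its condition expression
    return type(tree.body[0].test).__name__

plain = classify("reductor = dispatch_table.get(cls)")    # "=" is rejected here
walrus = classify("reductor := dispatch_table.get(cls)")  # assignment expression
equal = classify("reductor == dispatch_table.get(cls)")   # comparison
```

The plain "=" form is rejected outright, while ":=" and "==" come back as different node types, so a tool never has to guess which one was meant.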

Still, the idea of two assignment operators just didn't sit well with me. That's when I realized I had seen this kind of a problem nearly thirty years ago, with the introduction of "J". I won't get into the details unless someone is interested, I'll just say that J turned APL into ASCII soup. It was and is ugly and it completely misses the point of the very reason APL has specialized notation; the very thing Iverson highlighted in his paper [0].

Back to Python.

This entire mess could have been avoided by making one simple change that would have possibly nudged the language towards a very interesting era, one where a specialized programming notation could be evolved over time for the benefit of all. That simple change would have been the introduction and adoption of APL's own assignment operator: "←"

In other words, these two things would have been equivalent in Python:

    a ← 23

    a = 23
What's neat about this is that both human and automated tools (linters, etc.) would have no problem understanding the difference between these:

    if reductor ← dispatch_table.get(cls):
    if reductor == dispatch_table.get(cls):
And the larger example would become this:

    if reductor ← dispatch_table.get(cls):
        rv ← reductor(x)
    elif reductor ← getattr(x, "__reduce_ex__", None):
        rv ← reductor(4)
    elif reductor ← getattr(x, "__reduce__", None):
        rv ← reductor()
    else:
        raise Error("un(deep)copyable object of type %s" % cls)
This assignment operator would work everywhere and, for a period of time, the "=" operator would be retained. The good news is that old code could be updated with a simple search-and-replace. In fact, code editors could even display "=" as "←" as an option. The transition to only allowing "←" (and perhaps other symbols) could be planned for Python 4.
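
The "editors could display = as ←" part can be sketched with the standard `tokenize` module. This is a rough illustration, not a real editor plugin; `show_arrows` is a hypothetical name, and the sketch only handles source that is already valid Python.

```python
import io
import tokenize

def show_arrows(source: str) -> str:
    """Render every plain `=` operator token as `←`.

    `==`, `<=`, `+=` and friends are separate tokens, and strings and
    comments are not operator tokens, so none of them are touched.
    """
    rows = [list(line) for line in source.splitlines(keepends=True)]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP and tok.string == "=":
            row, col = tok.start  # row is 1-based, col is 0-based
            rows[row - 1][col] = "←"
    return "".join("".join(chars) for chars in rows)

sample = "a = 23\nif a == 23:\n    b = 'x=y'  # c = d\n"
rendered = show_arrows(sample)
```

One caveat the sketch makes visible: the "=" in keyword arguments and default values (`f(k=1)`) is the same token, so a real tool would need to decide whether those become "←" as well.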

Clean, simple and forward-looking. That, to me, is a good solution. Today we have "=" and ":=" which, from my opinionated perspective, does not represent progress at all.

[0] http://www.eecg.toronto.edu/~jzhu/csc326/readings/iverson.pd...

[1] https://news.ycombinator.com/item?id=21426338

106 comments

[+] viraptor|6 years ago|reply
I feel like this is a troll so good I'm even questioning if it's really a troll. But in short: you can't require non-ASCII symbols for languages which are supposed to be accessible. And <- won't work since it's valid syntax in other contexts.
[+] ken|6 years ago|reply
I heard the same complaints about indentation for scope, 20 or 25 years ago, and yet it seems to have worked out OK for them. It's interesting how many of these rules we know to be true, aren't.

I for one am tired of programming languages trying to eke out the best ASCII art they can manage with the symbols that happened to end up on American keyboards by historical accident. Or wanting to look 97% the same as ANSI C, even though their semantics are completely different.

For the people who really cannot type anything but US-ASCII on their DEC VT100 terminals, and don't use an editor or operating system which allows them to compose other characters, there are still ASCII combinations that look more or less like a left-arrow, and which are invalid Python syntax today.

[+] aaron_m04|6 years ago|reply
It was successful trolling. Look at all the comments...
[+] catalogia|6 years ago|reply
APL generally feels like a troll, like INTERCAL, but from my interactions with APL advocates I think they are sincere.
[+] thomasahle|6 years ago|reply
Go managed to use <- still..
[+] goatinaboat|6 years ago|reply
> But in short: you can't require non-ascii symbols for languages which are supposed to be accessible

While I agree that everything should be ASCII, the Python community has believed everything should be Unicode since Python 3.

[+] danso|6 years ago|reply
As someone who's had to teach programming to novices, I woefully underestimated how confusing (and justifiably so) `x = 1` notation is to students who didn't have the benefit of a CS education (i.e. exposure to C/C++/Java, where you eventually take the syntax for granted). I almost considered switching to teaching R, which has the explicit assignment operator `<-` (in addition to the traditional `=`).

I wouldn't mind seeing `<-` in Python. My biggest issue with `←` is that it's too inconvenient to type, as others have pointed out.

[+] squiggleblaz|6 years ago|reply
But programmers normally don't type things out longhand. They use tools for their language which automatically add indents and autocomplete symbols. If the only alternative parse for <- in Python is "less than minus", it should be fairly straightforward to program any serious IDE to replace <- with ← in the right context.

Would programming in a language with first class variables like Haskell help you? In Haskell, x = y means "let x and y be synonyms, which the compiler can freely substitute" whereas to mutate a value you have to say something like `put x y` (x is not a variable, it merely refers to a variable). Its approach is so different from the standard one that it explodes my head, but I wonder if it's more approachable to newbies.

[+] fred256|6 years ago|reply
Unless I misunderstand what you are proposing, if you are going to deprecate both = and := in favor of ←, why not simply deprecate = in favor of := ?
[+] BerislavLopac|6 years ago|reply
This was my first thought as well. Of course, getting rid of `=` would make the conundrum about moving Python from version 2 to version 3 seem like a quiet evening at home, but there is no practical reason why `:=` couldn't work in all the contexts where `=` does (currently, using `:=` for a simple assignment statement is a syntax error).
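
A small sketch of that parenthetical, assuming Python 3.8 or later; `compiles` is a hypothetical helper that just reports whether a snippet is accepted by the compiler.

```python
def compiles(source: str) -> bool:
    """Return True if `source` is accepted as a Python statement."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

bare_walrus = compiles("x := 1")      # unparenthesized at statement level
parenthesized = compiles("(x := 1)")  # allowed once wrapped in parentheses
plain = compiles("x = 1")             # ordinary assignment statement
```

So the restriction is a deliberate rule about where `:=` may appear, not a limitation of the operator itself.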
[+] jodrellblank|6 years ago|reply
APL's assignment operator is very strange to me; simple array of numbers assigned to 'nums':

    nums←1 2 3 4 5
Then catenate (append) a 6 to the right side end of it:

          nums,←6
          nums
    1 2 3 4 5 6 
Now use position-reversed catenate to append a 6 to the left side end of it; how does that work?:

          nums,⍨←6
          nums
    6 1 2 3 4 5 6 
Now use addition-assign to add 2 to each item; how:

          nums+←2
          nums
    8 3 4 5 6 7 8 
Now assign a single value into non-consecutive locations using array subscript notation on the left side of the assignment; that's unusual:

          nums[2 4]←99
          nums
    8 99 4 99 6 7 8 
Now do a max-assignment to replace values which are less than 7 with 7:

          nums⌈←7
          nums
    8 99 7 99 7 7 8 
      
Nums reset to 1 2 3 4 5, laminated with 'hello' to pair each number with a letter:

          nums←1 2 3 4 5
          nums,[0.5]'hello'
    1 2 3 4 5
    h e l l o
          nums
    1 2 3 4 5 
Want to do that and update nums? Stick an assignment in there:

          nums←1 2 3 4 5
          nums,[0.5]←'hello'
          nums
    1 2 3 4 5
    h e l l o
It seems so odd that this can pull values out and modify them with the value being assigned and put the result back, assigning to 'nums' and modifying it, while keeping the shape and structure of it.
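
For readers who think in Python, here is a rough rendering of the first five steps using plain lists. It is only a sketch: each line spells out the "read, combine, write back" that the APL modified assignment does implicitly, and the index step subtracts 1 because the APL session above counts from 1.

```python
nums = [1, 2, 3, 4, 5]            # nums←1 2 3 4 5
nums = nums + [6]                 # nums,←6    append on the right
nums = [6] + nums                 # nums,⍨←6   append on the left
nums = [n + 2 for n in nums]      # nums+←2    add 2 to each item
for position in (2, 4):           # nums[2 4]←99
    nums[position - 1] = 99       # APL positions are 1-based here
nums = [max(n, 7) for n in nums]  # nums⌈←7    raise anything below 7 to 7
```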
[+] improbable22|6 years ago|reply
For my amusement mostly, here's the Julia translation:

    nums = [1,2,3,4,5]
    nums ↑ 6              #  const ↑ = push!; const ⤉ = pushfirst!
    nums ⤉ 6  
    nums .+= 2            # addition-assign to add 2 to each item
    nums[[2,4]] .= 99     
    nums
    nums ⩠ 7              #  ⩠(a,b) = a[a.<b].=b
    nums

    nums = [1,2,3,4,5]
    nums ↔ "hello"        #  ↔(a,b) = hcat(collect(a), collect(b))
    nums = nums ↔ "hello" # re-binding nums
[+] tuukkah|6 years ago|reply
In summary, Python needs it to be written := and not ← because the latter is not ASCII and not on standard keyboard layouts. And it cannot be written <- either because that would redefine what "if x<-5" currently means, i.e. "if x < -5".

Interestingly, the PEP considered the close alternatives "if x from -5" and "if -5 -> x": https://www.python.org/dev/peps/pep-0572/#alternative-spelli...

[+] jacquesm|6 years ago|reply
That's not on my keyboard. Assuming you are serious, if you think a mainstream language like Python would ever accept an operator that can't be typed then you're mistaken.
[+] urda|6 years ago|reply
I wanted to just drop a comment that I appreciate the time you took into this write up with code examples. It made understanding your POV very easy.
[+] robomartin|6 years ago|reply
Thank you. I appreciate your comment very much.
[+] ryl00|6 years ago|reply
I like the trade-off C/C++ compilers (with a reasonable amount of warnings set) made... warn about constructs like "if (a=b)" unless the user adds an extra pair of parentheses, like "if ((a=b))".
[+] jmount|6 years ago|reply
It gets even better. := is an abandoned assignment operator in R (derived from ←). It is still in the syntax with no default implementation. A few packages use it (data.table, dplyr, wrapr) for quoted assignment tasks.
[+] patrec|6 years ago|reply
Python simply should never have had = as assignment, and should always have used := just like Pascal does. Using = for assignment and == for equality is part of C's toxic legacy.
[+] phlakaton|6 years ago|reply
I like it! Reminds me of one of my favorite Haskell extensions: https://wiki.haskell.org/Unicode-symbols

I mean, I've never attempted to actually type that beastly notation during programming, but it does print out quite beautifully!

[+] squiggleblaz|6 years ago|reply
It's not beastly. Just a few :imap's will get you there in vim.
[+] sedachv|6 years ago|reply
Smalltalk (among other languages of the 1960s and 1970s) had a left arrow symbol as the assignment operator. Code point 95 was ← instead of _ in ASCII before 1967.

Also, anyone pretending to be a competent computer programmer while whining about being unable to use Unicode in your text editor, please find a new hobby.

[+] squiggleblaz|6 years ago|reply
> Also, anyone pretending to be a competent computer programmer while whining about being unable to use Unicode in your text editor, please find a new hobby.

Yes. It seems like programmers are people who make the technology of the next generation using the technology of the previous generation.

[+] sloaken|6 years ago|reply
I think historically the "←" was preferred, but many keyboards (card punch machines) did not have the "←", so they used ":=" as a simplified (more commonly accessible) version of "←".
[+] robomartin|6 years ago|reply
What to say, imagine my surprise seeing this on the first page of HN last night. Got my 15 minutes, I guess.

As is often the case, the responses contain a bunch of hateful stuff as well as people who actually try to engage with the ideas presented in a constructive manner. To state the obvious: I am not changing anything, so chill people.

Now to address some of the responses, starting with the best:

> I wanted to just drop a comment that I appreciate the time you took into this write up with code examples. It made understanding your POV very easy.

Thank you. Some of the comments lead me to believe we might have created a society where everything needs to be done in 140 characters. I am old school: arguments need to be presented with enough background information for the reader to understand the perspective that led to the proposal or conclusion. Without that we are all just screaming things and waiting to see what sticks.

> APL is an almost perfect textbook example of write-only code.

Sure, just like math and musical notation are write-only. The only people who say stuff like this are those who don't know a thing about APL. There are two major APL compilers still on the market, the licenses for these cost in excess of $2,500 per year. If APL was "write-only" these two companies (IBM and Dyalog) would have shelved these products decades ago. The fact is that, while not mainstream (and for good reasons), APL is still in use and large codebases are being maintained and expanded. Historically this has mostly been in the financial sector.

Point of interest: My last large APL application was for a hardware/software system used during the race to decode the human genome. Among other things, my application interfaced with the hardware and was used to search for genomes in the sequences being produced. It also produced reports, graphs, managed the database, etc. The hardware portion was coded in Forth.

> APL's assignment operator is very strange to me

Thanks for a very detailed comment with code examples. I think I can help clarify the basic ideas.

The first thing to do is realize the assignment operator isn't passive but rather an active component of a statement being evaluated from right to left. That last part is important: APL evaluates from right to left, the only precedence rule being that parentheses are evaluated first. One could say indexing is also an exception, but indexing "variable[23]" is a single monolithic expression that isn't separable, so it is evaluated as one item, not two.
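
To make the right-to-left rule concrete for people coming from Python, here is a toy evaluator. It is a sketch under heavy assumptions: it only handles a flat list of numbers separated by two-argument operators, and `eval_right_to_left` and `OPS` are names invented for this illustration.

```python
import operator

# Hypothetical operator table for the toy evaluator.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "×": operator.mul,
    "÷": operator.truediv,
}

def eval_right_to_left(tokens):
    """Evaluate [number, op, number, op, ...] with no precedence, right to left."""
    result = tokens[-1]
    i = len(tokens) - 2           # index of the rightmost operator
    while i > 0:
        left = tokens[i - 1]
        result = OPS[tokens[i]](left, result)
        i -= 2                    # step to the next operator on the left
    return result

apl_style = eval_right_to_left([2, "×", 3, "+", 4])   # 2 × (3 + 4)
python_style = 2 * 3 + 4                              # (2 * 3) + 4
```

The same tokens give 14 under the right-to-left rule and 10 under Python's precedence rules, which is the whole difference in one line.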

So, if A and B are defined as vectors (APL's name for one dimensional arrays), each consisting of a few random values between 1 and 10:

    a ← 5?10  
    b ← 5?10
    a
    1 10 3 4 9 
    b
    9 5 6 1 4 
We can, for example, concatenate them like this:

    a,b
    1 10 3 4 9 9 5 6 1 4
Again, right-to-left evaluation says "take b, set it up to concatenate to something, ah, we are concatenating to a, another vector".

However, we can modify this process by giving the interpreter further instructions within the assignment. The simplest one is where we select, or index, the vectors and only grab specific elements to concatenate. I'll define a pair of new vectors to make this easier to see:

    c ← "ABCDE"
    d ← "abcde"
    c
    ABCDE
    d
    abcde
With this:

    c,d[1]
    ABCDEa

    c,d[2]
    ABCDEb

    c,d[2 5]
    ABCDEbe

    c[2 5],d
    BEabcde

    c[1],d 
    Aabcde

    c[2],d
    Babcde
The first few are the obvious cases, where we pull elements out of the right side vector and concatenate them to the left side vector. In the other cases we take the entire right side vector and concatenate it to a subset of elements of the left side vector. I'd say this forms the basis for understanding the expressions you presented.
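
A rough Python equivalent of those index-then-concatenate cases, using strings. `pick` is a hypothetical helper that takes 1-based positions to match the APL session above; Python's own indexing is 0-based.

```python
def pick(vector, *positions):
    """Select items by 1-based position, like vector[2 5] in the session above."""
    return "".join(vector[p - 1] for p in positions)

c = "ABCDE"
d = "abcde"

right_one = c + pick(d, 1)       # c,d[1]
right_two = c + pick(d, 2, 5)    # c,d[2 5]
left_two = pick(c, 2, 5) + d     # c[2 5],d
left_one = pick(c, 1) + d        # c[1],d
```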

The above statements can be combined with assignment in order to do the work more efficiently, plain concatenation being the simplest case:

    e ← "mnlop"
    f ← "MNLOP"
    e
    mnlop
    f
    MNLOP

    e,←f
    e
    mnlopMNLOP
Again, right-to-left, we have vector "f", it will be concatenated to something, the assignment is a concatenation, it is concatenated and assigned to vector "e".

In other words, one can specify operations as part of the assignment. Because evaluation is from right to left, it stands to reason that the add-on operation has to be specified after you told the interpreter you are going to assign. This might be strange from the context of other paradigms but it makes complete sense once you grok the simple "everything is right-to-left" idea. If I were to propose how to read something like ",←" I would say "assign with concatenation"; "+←" would be "assign with addition".

We can do even more. It's easier to show this with multidimensional arrays:

    g ← 3 4 ⍴ "ABCDEFGHIJKL"
    h ← 2 4 ⍴ "abcdefgh"
    i ← 3 3 ⍴ "123456789"

      g
    ABCD
    EFGH
    IJKL

      h
    abcd
    efgh

      i
    123
    456
    789
The first statement reads something like this: take the character vector "ABCDEFGHIJKL", reshape it "⍴" using the two-element numeric vector "3 4", thereby creating a 3x4 matrix, and assign it to "g".
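
In Python terms, reshape can be sketched like this. `reshape` is a name made up for this example, the matrix is modelled as a list of row strings, and the cycling of short data mirrors how APL's reshape reuses elements when there are too few.

```python
from itertools import cycle, islice

def reshape(rows, cols, data):
    """Lay `data` out row by row into `rows` strings of length `cols`.

    Data is reused cyclically if it is too short and truncated if too long.
    """
    flat = list(islice(cycle(data), rows * cols))
    return ["".join(flat[r * cols:(r + 1) * cols]) for r in range(rows)]

g = reshape(3, 4, "ABCDEFGHIJKL")
h = reshape(2, 4, "abcdefgh")
i = reshape(3, 3, "123456789")
```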

Note that "i" contains characters, not numbers, so if I try to add 1 to every value I get an error:

    i+1
    DOMAIN ERROR
Now I can do a few things, for example:

Concatenate g and h along the first axis (they both have four columns):

      g,[1]h
    ABCD
    EFGH
    IJKL
    abcd
    efgh
Of course, all I have to do is add the assignment operator to change g:

      g,[1]←h

      g
    ABCD
    EFGH
    IJKL
    abcd
    efgh
As you can see, this is a natural extension of the syntax and the order of evaluation.

We can concatenate along the second axis if we use g and i:

      g,[2]i
    ABCD123
    EFGH456
    IJKL789
Assignment works just as well in this case:

      g,[2]←i

      g
    ABCD123
    EFGH456
    IJKL789
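
The two concatenations can be sketched in Python with matrices modelled as lists of row strings. The function names are invented for this illustration, and the shape checks stand in for the error APL would raise on mismatched shapes.

```python
def concat_first_axis(top, bottom):
    """Stack rows, like g,[1]h; both must have the same number of columns."""
    if len(top[0]) != len(bottom[0]):
        raise ValueError("column counts differ")
    return top + bottom

def concat_second_axis(left, right):
    """Join rows side by side, like g,[2]i; both must have the same number of rows."""
    if len(left) != len(right):
        raise ValueError("row counts differ")
    return [a + b for a, b in zip(left, right)]

g = ["ABCD", "EFGH", "IJKL"]
h = ["abcd", "efgh"]
i = ["123", "456", "789"]

stacked = concat_first_axis(g, h)
side_by_side = concat_second_axis(g, i)
```

The assigning forms `g,[1]←h` and `g,[2]←i` then correspond to re-binding, e.g. `g = concat_first_axis(g, h)`.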
This syntax does have limitations. For example, what if I actually wanted to end-up with this?

    AEIae
    BFJbf
    CGKcg
    DHLdh
Well, we transpose the matrices during the concatenation, this works:

      (⍉g),[2]⍉h
    AEIae
    BFJbf
    CGKcg
    DHLdh
This does not work:

      (⍉g),[2]⍉←h
    SYNTAX ERROR
I don't recall why this is so. I think it has to do with starting to gobble-up piles of memory to keep intermediate results or something like that. You also end-up executing transpose (⍉) twice, which is inefficient. I don't remember.

This is a case where you have to resort to the simpler case:

      ⍉g,[1]h
    AEIae
    BFJbf
    CGKcg
    DHLdh
With assignment:

      g ← ⍉g,[1]h
      g
    AEIae
    BFJbf
    CGKcg
    DHLdh
In other words, concatenate along the first axis, transpose and then assign.
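
The same three steps in Python, again with a matrix modelled as a list of row strings; `transpose` is a hypothetical helper built on `zip`.

```python
def transpose(rows):
    """Turn columns into rows, like monadic ⍉ on a matrix."""
    return ["".join(column) for column in zip(*rows)]

g = ["ABCD", "EFGH", "IJKL"]
h = ["abcd", "efgh"]

# g ← ⍉g,[1]h : concatenate along the first axis, transpose, then assign
g = transpose(g + h)
```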

I got a little off track here with respect to the idea of using a different assignment symbol for Python but I thought your comment and implied question was one of the most interesting ones in the thread, which made it fun to address. I hope this helps. Here's a very useful resource:

https://www.dyalog.com/uploads/documents/MasteringDyalogAPL....

Ah, I do have to address one more:

> That's not on my keyboard.

Really people? It's 2019. I was typing APL characters back in the 1980's on DEC and Tektronix terminals and the IBM PC. Imagine the audacity of proposing that one can actually enter things other than ASCII or what's on the keyboard in 2019.

On the free NARS2000 interpreter you enter "←" very simply with "Alt [". In fact most APL characters are entered this way. The reshape operator, rho, "⍴", is entered using "Alt r". Not that complicated. I am sure this is well within the brain capacity of most mortals, it was so over 30 years ago.

Final thought: One of the interesting things about using APL long enough to internalize it (something like what happens when you don't think about reading musical notation any more and just see the music) is that you start thinking and seeing objects and their manipulation in your brain.

The analogy that comes to mind is what I do when I run SolidWorks. I can extrude, cut, scale, rotate, cut holes, create solids from rotated profiles, slice, animate and manipulate complex assemblies of three dimensional objects visually and with ease. The --imperfect-- analogy to conventional programming would be something like AutoCAD, which I have been using since version 1.0 in the dark ages. It's a flat 2D world. Yes, it can do 3D but the cognitive load and lack of tools makes it a terrible tool for working with three dimensional solid objects and assemblies.

Live long and prosper.

[+] anthony_doan|6 years ago|reply
R has a <- operator, and a quick Google search suggests Python currently isn't using <-.

So being pragmatic I would choose <- over ←.

[+] kelnos|6 years ago|reply
Not going to work:

    >>> x = -3
    >>> if x <- 2:
    ...     print('yes')
    ... 
    yes
    >>> x = 5
    >>> if x <- 2:
    ...     print('yes')
    ... 
    >>>
Yes, those are ugly examples of bad use of whitespace, but making '<-' mean assignment would be a backwards-incompatible change.

(In case it isn't clear, that evaluates as "if x is less than negative two".)
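
The parse can also be inspected directly with the standard `ast` module. A small sketch; the variable names are just for illustration.

```python
import ast

# Parse the condition on its own and look at the resulting expression node.
expr = ast.parse("x <- 2", mode="eval").body

is_comparison = isinstance(expr, ast.Compare)
is_less_than = isinstance(expr.ops[0], ast.Lt)
right = expr.comparators[0]
is_negated = isinstance(right, ast.UnaryOp) and isinstance(right.op, ast.USub)

# Evaluating with the two values from the session above:
with_minus_three = eval("x <- 2", {"x": -3})
with_five = eval("x <- 2", {"x": 5})
```

The parser sees a "less than" comparison whose right-hand side is a negated literal, so there is no free "<-" token left to repurpose.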

[+] goodside|6 years ago|reply
I’m annoyed you buried such a simple suggestion after so many examples and pointless tangents. The reason this isn’t done should be obvious to you: your proposed operator isn’t ASCII, unlike everything else in the syntax. (I have to call it “your proposed operator” because I don’t know or care how to type it on my iPhone — so it’s already causing problems.) The nearest ASCII rendering, `<-` as used in R, is syntactically ambiguous. You should have started this post with your actual proposal, instead of prefacing it with pompous language and details.
[+] Sean1708|6 years ago|reply
Also from what I can tell their argument is actually about replacing all uses of `=` with something else, and there's no reason the something else couldn't be `:=` (which has plenty of prior art, even if it has fallen out of favour recently).
[+] b0rsuk|6 years ago|reply
FYI, it's called Bottom Line Up First (BLUF), and comes from military.
[+] geofft|6 years ago|reply

[deleted]

[+] rbanffy|6 years ago|reply
Python and APL are almost opposite extremes of the readability spectrum. APL is an almost perfect textbook example of write-only code.

I would be extremely wary about borrowing syntax from APL or J.

[+] stuaxo|6 years ago|reply
Great, but I have no way to type that on my keyboard, is it <- ?