Trying to clarify it a bit: it gets interpreted as "[0xf or (x in (1,2,3))]", and the "or" short-circuits, never evaluating the second part (which would otherwise have raised a NameError, since x is undefined). It therefore evaluates to [0xf], or just [15] once you convert the hex notation to an integer.
If you neglect the outer list, which (like the x) is only there to fool you into thinking this has something to do with list comprehensions, and if you don't use hex to make it seem like there is a "for" in the middle, it boils down to something like "15or whatever()". That doesn't seem all that confusing, even if "whatever" uses an uninitialized variable like x, or is undefined, because we're in Python and it's only evaluated when it runs. The 'confusion' then boils down to a Wat about why it's legal to write "15or 3" without a space before the or, and especially why "0xfor 3" works. This is documented, as other comments mention, and is due to the way the parsing works.
I don't know if the parser could be changed to require a leading space before the or operator, but it's pretty clear to me that this is only confusing if you intentionally do your very best to add confusing structure around it, in an attempt to fool the reader into thinking something very different and weird is going on.
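The short-circuit reading described above can be checked directly. This is a minimal sketch, written with explicit spaces and parentheses so that it does not depend on the tokenizer quirk; `x` is deliberately left undefined, and `result` and `raised` are names introduced here purely for illustration:

```python
# x is deliberately never defined anywhere in this snippet.

# 0xf is 15, which is truthy, so `or` returns it without evaluating the right-hand side.
result = [0xf or (x in (1, 2, 3))]
print(result)

# With a falsy left operand the right-hand side is evaluated, and the undefined x is reached.
try:
    [0 or (x in (1, 2, 3))]
    raised = False
except NameError:
    raised = True
print(raised)
```

The first expression never touches x, which is why the original one-liner runs at all; the second one does, and fails with a NameError.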
> it's pretty clear to me that this is only confusing if you intentionally do your very best to try to add confusing structure around it in an attempt to fool the reader into thinking something very different and weird is going on.
This has been raised by a core developer before [1], so it's not just code golfers having a laugh.
Yes, the docs note the tokenization behavior [2], but Guido's response today in the above mail thread is also pretty unambiguous:
> I would totally make that a SyntaxError, and backwards compatibility be damned.
> it's pretty clear to me that this is only confusing if you intentionally do your very best to try to add confusing structure around it in an attempt to fool the reader into thinking something very different and weird is going on.
Sure, but that's half the fun :-) I frequently see Ned's name pop up with interesting things like this.
Yeah, this also evaluates to [15], and it's easier to see what's going on:
[15or x in (1, 2, 3)]
As another person pointed out, Python's lexer sees 0xfor (or 15or) and splits it into "0xf" and "or" ("15" and "or" in the other case); the parser then processes the tokens as usual.
> Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).
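The split described above can be observed with the standard library's tokenize module. This is only a sketch: `significant_tokens` is a helper name made up for this example, and warnings are silenced on the assumption that newer interpreters may complain about a keyword placed directly after a numeric literal.

```python
import io
import tokenize
import warnings

def significant_tokens(source):
    """Return (type name, text) pairs for the number, name and operator tokens in source."""
    wanted = {tokenize.NUMBER, tokenize.NAME, tokenize.OP}
    with warnings.catch_warnings():
        # Assumption: newer Python versions may warn about a keyword right after a number.
        warnings.simplefilter("ignore")
        tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    return [(tokenize.tok_name[tok.type], tok.string) for tok in tokens if tok.type in wanted]

print(significant_tokens("0xfor 1"))
print(significant_tokens("15or 3"))
```

In both cases the number token ends where the valid digits end, and "or" comes out as a separate name token.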
Any way to lint this pattern away? Say what you will about language and lexer complexity or backwards compatibility; this is a code smell if I have ever seen one.
This illustrates why lexer tokens should preferably not be defined as exactly what the language allows. Instead the internal lexer definition should include invalid tokens (like “0xfor”) that are only sorted out in a later step (in this case, when actually converting it to an integer value).
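A rough sketch of that idea, not how CPython's tokenizer actually works: the lexer greedily grabs anything that looks number-like, and only a later validation step decides whether it is a real literal. The names `NUMBER_LIKE` and `lex_number` are hypothetical, and `int(text, 0)` stands in for the validation step (it only covers integer literals).

```python
import re

# Deliberately over-permissive: a digit followed by any run of letters, digits or underscores.
NUMBER_LIKE = re.compile(r"[0-9][0-9A-Za-z_]*")

def lex_number(source, pos=0):
    """Lex one number-like token at pos and return (value, end position).

    Raises SyntaxError if the greedy token is not a valid integer literal.
    """
    match = NUMBER_LIKE.match(source, pos)
    if match is None:
        raise ValueError("no number-like token at position %d" % pos)
    text = match.group()
    try:
        # Base 0 makes int() accept prefixed literals such as 0x, 0o and 0b.
        value = int(text, 0)
    except ValueError:
        raise SyntaxError("invalid numeric literal: %r" % text) from None
    return value, match.end()

print(lex_number("0xf or 1"))
try:
    lex_number("0xfor 1")
except SyntaxError as exc:
    print("rejected:", exc)
```

With this scheme "0xfor" is swallowed as a single token and rejected as a whole, instead of being quietly split into "0xf" and "or".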
I initially assumed the missing space was a typo in the headline, so I tried [0x for x in (1,2,3)] and got "invalid hexadecimal literal" as I would have expected. Took me a few minutes to realize the typo was intentional.
I still can't figure out what the author was trying to do when he stumbled onto that, though...
There's no ambiguity though, right? [0x for x in (1,2,3)] is nonsense (one typo off from something legitimate, but nonsense regardless). I can't think of another way someone might expect this to parse.
Looks like the parser knows numbers end when the digits end, so it splits them into two tokens (ignoring the optional [lL] long suffix before Python 3).
One of those things that looks helpful at first glance, but is problematic in the long run. Throwing a syntax error immediately would lead to more robust/maintainable code.
# launch debugger with the program provided inline, consisting of the statement `1`.
# this program is ignored and we just mess with some variables.
% perl -de1
Loading DB routines from perl5db.pl version 1.57
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(-e:1):   1
  DB<1> @a = (1..4)
  DB<2> x \@a
0  ARRAY(0x7faf890418f8)
   0  1
   1  2
   2  3
   3  4
  DB<3> x scalar @a
0  4
  DB<4> sub foo { return (1..4) }
  DB<5> @b = foo()
  DB<6> x scalar @b
0  4
  DB<7> x scalar foo()
0  ''
Yes, that's the empty string, aka false: in scalar context the .. operator is the flip-flop operator rather than a list range, so it returns entirely different things, which lets you write code like
while (<STDIN>) {
    next if /^BEGIN$/ .. /^END$/;
    # ...
}
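For readers less familiar with Perl, here is a rough Python analogue of what that flip-flop loop does. It is a sketch of the behaviour only, not of Perl's implementation; `skip_blocks` and `inside` are names invented for this example.

```python
import re

def skip_blocks(lines):
    """Return the lines that fall outside BEGIN ... END blocks (markers excluded too)."""
    inside = False  # plays the role of the flip-flop's hidden state
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        # The flip-flop turns on when the left pattern matches.
        if not inside and re.search(r"^BEGIN$", text):
            inside = True
        if inside:
            # It turns off again after the line on which the right pattern matches.
            if re.search(r"^END$", text):
                inside = False
            continue  # equivalent of `next`: skip this line
        kept.append(text)
    return kept

print(skip_blocks(["a", "BEGIN", "b", "END", "c"]))
```

The point of the Perl version is that the state variable is implicit in the .. operator, which is also why the same operator gives such a surprising result when a list range ends up in scalar context.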
[+] [-] jVinc|5 years ago|reply
[+] [-] sco1|5 years ago|reply
1: https://mail.python.org/archives/list/[email protected]/...
2: https://docs.python.org/3/reference/lexical_analysis.html#wh...
[+] [-] d23|5 years ago|reply
[+] [-] takeda|5 years ago|reply
[+] [-] maccard|5 years ago|reply
[+] [-] css|5 years ago|reply
https://docs.python.org/3/reference/lexical_analysis.html#wh...
[+] [-] tannhaeuser|5 years ago|reply
[+] [-] emmelaich|5 years ago|reply
Reminds me of Fortran.
https://arstechnica.com/civis/viewtopic.php?t=862715
[+] [-] baruchel|5 years ago|reply
[+] [-] CameronNemo|5 years ago|reply
[+] [-] adpirz|5 years ago|reply
0xf = Hex for 15
Expression is evaluated as a boolean
[+] [-] wyldfire|5 years ago|reply
[+] [-] mjs7231|5 years ago|reply
* 0xfor1 evaluates.
* 1or 2 evaluates.
* 1or2 doesn't.
* ''or'foo' evaluates.
This is gross.
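Three of the cases above can be checked with compile(). This is a hedged sketch: `evaluates` is a helper name made up here, warnings are silenced because newer interpreters may warn about a keyword glued to a number, and the result for "1or 2" may differ on versions that reject that form outright. "1or2" fails because "or2" is read as a single identifier rather than as the keyword followed by 2.

```python
import warnings

def evaluates(source):
    """Return True if source compiles as an expression, False on a SyntaxError."""
    with warnings.catch_warnings():
        # Assumption: newer Python versions may warn about a keyword right after a number.
        warnings.simplefilter("ignore")
        try:
            compile(source, "<example>", "eval")
        except SyntaxError:
            return False
    return True

for source in ("1or 2", "1or2", "''or'foo'"):
    print(repr(source), evaluates(source))
```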
[+] [-] kristaps|5 years ago|reply
edit: 'cause it's a bug! https://bugs.python.org/issue43833
[+] [-] layer8|5 years ago|reply
[+] [-] nickysielicki|5 years ago|reply
[+] [-] tgv|5 years ago|reply
[+] [-] commandlinefan|5 years ago|reply
[+] [-] TrackerFF|5 years ago|reply
That is, 0or, 0xor, 0and, etc. In my view, this should clearly be a syntax error.
edit: as noted by the parent comment, 0xor is not "0 xor" but "0x or". Still, this seems to hold true for all data types.
[+] [-] brundolf|5 years ago|reply
[+] [-] CameronNemo|5 years ago|reply
[+] [-] mixmastamyk|5 years ago|reply
[+] [-] dagss|5 years ago|reply
[+] [-] Nokinside|5 years ago|reply
[+] [-] obi1kenobi|5 years ago|reply
[+] [-] Sohcahtoa82|5 years ago|reply
https://docs.python.org/3/reference/lexical_analysis.html#wh...
[+] [-] noobermin|5 years ago|reply
[+] [-] fennecfoxen|5 years ago|reply
[+] [-] xenonite|5 years ago|reply
`11-0xbor-11` is -11
and `11-0xbor-0xbor-0xbor-11` is -11, too
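Spelled out with explicit spacing and parentheses, the arithmetic behind both results looks like this (a small sketch; `first` and `second` are names chosen here for illustration):

```python
# `11-0xbor-11` tokenizes as: 11 - 0xb or -11
# Arithmetic binds tighter than `or`, so the left operand is (11 - 0xb), which is 0.
first = (11 - 0xb) or (-11)
print(first)

# In the longer chain the falsy 0 is skipped and `or` returns the first truthy
# operand, which is -0xb, i.e. -11; the remaining operands are never evaluated.
second = (11 - 0xb) or (-0xb) or (-0xb) or (-11)
print(second)
```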
[+] [-] 9front|5 years ago|reply
[+] [-] noobermin|5 years ago|reply
[+] [-] bjornorn|5 years ago|reply
[+] [-] phsilva|5 years ago|reply
15