SQL shows its age by having exactly the same problem.
Queries should start with the `FROM` clause; that way the entities involved can be resolved quickly and a smart editor can help you write a sensible query faster.
The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
You could even avoid crap like `SELECT * FROM table`, and just write `FROM table` and have the select clause implied.
Never mind me, I'm just an old man with a grudge, I'll go back to my cave...
Yes, C#'s DSL that compiles to SQL (LINQ to SQL) does the same thing, putting `from` before the other clauses, for the same reason: it lets the IDE's code completion offer fields while you type the other clauses.
I do agree that it is about time SQL had a variant starting with FROM. It shouldn't be that hard to support, and the lack of one feels like unwillingness to improve the experience.
This was a historical decision because SQL is a declarative language.
I have to admit I was confused for too long about the order in which SQL clauses are evaluated:
FROM/JOIN → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
As a self-taught developer, I didn't know what I was missing, but now the mechanics seem clear, and if somebody really needs to handle SELECT with given names, then they should probably use a CTE:
WITH src AS (SELECT * FROM sales),
proj AS (SELECT customer_id, total_price AS total FROM src),
filt AS (SELECT * FROM proj WHERE total > 100)
SELECT * FROM filt;
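As a rough check that the CTE chain behaves as described, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows, the `customers_over` helper, the threshold parameter, and the added `ORDER BY` are assumptions made for the illustration; only `sales`, `customer_id`, and `total_price` come from the query above.

```python
import sqlite3


def customers_over(threshold):
    """Run the src -> proj -> filt CTE chain on a tiny, made-up sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer_id INTEGER, total_price REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(1, 50.0), (2, 150.0), (3, 300.0)],  # hypothetical sample data
    )
    rows = conn.execute(
        """
        WITH src  AS (SELECT * FROM sales),
             proj AS (SELECT customer_id, total_price AS total FROM src),
             filt AS (SELECT * FROM proj WHERE total > ?)
        SELECT * FROM filt ORDER BY customer_id
        """,
        (threshold,),
    ).fetchall()
    conn.close()
    return rows


print(customers_over(100))
```

Each CTE only sees the names produced by the previous one, which is why `total` can be filtered on in `filt`.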
It's true. I call authoring SQL "holographic programming" because any change I want to make frequently implies a change to the top and bottom of the query too; there's almost never a localized change.
C++ has this issue too due to the split between header declarations and implementations. Change a function name? You're updating it in the implementation file, and the header file, and then you can start wondering if there are callers that need to be updated also. Then you add in templates and the situation becomes even more fun (does this code live in a .cc file? An .h file? Oh, is your firm one of the ones that does .hh files and/or .hpp files also? Have fun with that).
> The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
Logically, SQL evaluates the clauses in the order FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY. This is why column aliases (defined in SELECT) work in ORDER BY but not in WHERE; some engines also accept them in GROUP BY or HAVING as an extension.
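To make the ordering concrete, here is a toy Python model of the logical steps, not a description of how any real engine is implemented. The `run_query` function and the sample column names are assumptions for the sketch. The point is only that the alias `total` does not exist yet when the WHERE step runs, but does exist by the time ORDER BY runs.

```python
def run_query(rows):
    """Toy model of: SELECT customer_id, total_price AS total
                     FROM sales WHERE total_price > 100 ORDER BY total DESC"""
    # FROM: start with the source rows.
    source = list(rows)
    # WHERE: only the source column names exist here, so "total" is unknown.
    filtered = [r for r in source if r["total_price"] > 100]
    # SELECT: the projection creates the alias "total".
    projected = [
        {"customer_id": r["customer_id"], "total": r["total_price"]}
        for r in filtered
    ]
    # ORDER BY: runs after SELECT, so the alias is visible.
    return sorted(projected, key=lambda r: r["total"], reverse=True)


sales = [
    {"customer_id": 1, "total_price": 50},
    {"customer_id": 2, "total_price": 150},
    {"customer_id": 3, "total_price": 300},
]
print(run_query(sales))
```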
I've always liked the `select...from` order because it helps me understand the goal before reading the logic. In other words, I want to end up with this, and here's how I want to go about getting it.
My main caveat here is that often the person starting a select knows what they want to select before they know where to select it from. To that end, having autocomplete for the sources of columns is far, far more useful than autocomplete for columns from a source.
I will also hazard a guess that the total number of columns most people would need autocomplete for is rather limited, such that you can almost certainly just tab-complete across all columns, if that is what you really want or need. The few of us that are working with large databases probably have a set of views that should encompass most of what we would reasonably be able to get from the database in a query.
Don’t know why Python gets so much love. It’s a painful language as soon as more than one person is involved. What the author describes is just the tip of the iceberg.
The same reason people are not flocking to the Lisps of the world: mathematical rigour and precision do not translate to higher legibility and understandability.
Python's list/dict/set comprehensions are equivalent to typed for loops: where everyone complains about Python being lax with types, it's weird that one statement that guarantees a return type is now the target.
Yet most other languages don't have the "properly ordered" for loop, Rust included (it's not "from iter as var" there either).
It's even funnier when function calling in one language is compared to syntax in another (you can do function calling for everything in most languages, a la Lisp). Especially in the given example for Python: there is the built-in `map` (and `functools.reduce`), after all.
This hasn't been my experience, but we use the Google style guide, linters, and static type verification to cut down on the number of options for how to write the program. Python has definitely strayed from its "one right way to do a thing" roots, but in the set of languages I use regularly it gives about the same amount of issues as JavaScript (and far, far less than C++) regarding having to deal with quirks that vary from user to user.
Failure to understand something is not a virtue. That it does get a lot of love strongly suggests that there are reasons for that. Of course it has flaws, but that alone doesn't tell us anything; only comparisons do. Create a comprehensive pro/con list and see how it fares. Then compare that to pro/con lists for other languages.
I used to agree with this completely, but type annotations & checking have made it much more reasonable. I still wouldn't choose it for a large project, but types have made it much, much easier to work with others' python code.
Python with strict type checking and its huge stdlib is my favourite scripting language now.
It's the exceptional codebase that's nice to work with when it gets large and has many contributors. Most won't succeed no matter the language. Language is a factor, but I believe a more important factor is caring a lot.
I'm working on a python codebase for 15 years in a row that's nearing 1 million lines of code. Each year with it is better than the last, to the extent that it's painful to write code in a fresh project without all the libraries and dev tools.
Your experience with Python is valid and I've heard it echoed enough times, and I'd believe it in any language, but my experience encourages me to recommend it. The advice I'd give is to care a lot, review code, and keep investing in improvements and dev tools. Git pre-commit hooks (just on changed modules) with ruff, pylint, pyright, isort, and unit test execution help a lot for keeping quality up and saving time in code review.
I love Python, as long as we are talking about small teams and short, short-lived programs. The lack of static types makes things quick to implement even if the result isn't too sound, and the strong typing keeps you from fucking up too badly. That is why I think it gets so much love in data science: it is great for noodling around while you are trying to figure something out.
On the other hand, if you are going to be building something that is going to be long-lived, with multiple different teams supporting it over time, and/or larger programs where it all doesn't fit in (human) memory, well, then Python is going to bite you in the ass.
There isn't a one size fits all programming language, you need at least two. A "soft" language that stays out of your way and lets you figure things out, and a "hard" language that forces the details to be right for long term stability and support.
Because everything that tries to fix it is just as painful in different ways.
I've had the displeasure of working in codebases using the style of programming OP says is great. It's pretty neat, until you get a chain 40 deep and have to debug it. You either have to use language features, like `show` in PySpark, which don't scale when you need to trace a dozen transformations, or you go back to imperative-style loops so you can log what's happening where.
Python was established as a fun and sensible language that was usable and batteries-included at a time when everything else was either expensive or excruciating, and has been coasting in that success ever since. If you'd only coded in bash, C/C++, and late-'90s Java, Python was a revelation.
List comprehensions were added to the language after it was already established and popular, and imho they were the first sign that the emperor might be naked.
Python 3 was the death of it, imho, since it showed that improving the language was just too difficult.
That would be nice if devs always wrote code sequentially, i.e. left to right, one character at a time, one line at a time. But the reality is that we often jump around, filling in some things while leaving other things unfinished until we get back to them. Sometimes I'll write code that operates on a variable, then a minute later go back and declare that variable (perhaps assigning it a test value).
Some IDEs provide code templates, where you type some abbreviation that expands into a corresponding code construct with placeholders, followed by having you fill out the placeholders (jumping from one to the next with Tab). The important part here is that the placeholders’ tab order doesn’t need to be from left to right, so in TFA’s example you could have an order like
{3} for {2} in {1}
which would give you code completion for {3} based on the {1} and {2} that would be filled in first.
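A small Python sketch of that idea, purely illustrative: `tab_order` and `fill_template` are hypothetical helpers, not any editor's actual API. The placeholder numbers decide the order in which fields are visited, independently of where they sit in the text.

```python
import re

PLACEHOLDER = re.compile(r"\{(\d+)\}")


def tab_order(template):
    """Placeholder numbers in the order the editor would visit them."""
    return sorted({int(n) for n in PLACEHOLDER.findall(template)})


def fill_template(template, answers):
    """Replace every {n} placeholder with answers[n]."""
    return PLACEHOLDER.sub(lambda m: answers[int(m.group(1))], template)


template = "{3} for {2} in {1}"
print(tab_order(template))  # visit {1} first, even though it is written last
print(fill_template(template, {1: "lines", 2: "line", 3: "line.upper()"}))
```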
There is generally a trade-off between syntax that is nice to read vs. nice to type, and I’m a fan of having nice-to-read syntax out of the box (i.e. not requiring tool support) at the cost of having to use tooling to also make it nice to type.
This is not meant as an argument for the above for-in syntax, but as an argument that left-to-right typing isn’t a strict necessity.
The consensus here seems to be that Python is missing a pipe operator. That was one of the things I quickly learned to appreciate when transitioning from Mathematica to R. It makes writing data science code, where the data are transformed by a series of different steps, so much more readable and intuitive.
I know that Python is used for many more things than just data science, so I'd love to hear if in these other contexts, a pipe would also make sense. Just trying to understand why the pipe hasn't made it into Python already.
This is almost FP vs OOP religious war in disguise. Similar to vim-vs-emacs ... where op comes first in vim but selection comes first in emacs.
If you design something to "read like English", you'll likely get verb-first structure, as embodied in Lisp/Scheme. Other languages like German and Tamil put verbs at the end, which aligns well with OOP-like "noun first" syntax. (It is "water drink" word for word in Tamil but "drink water" in English.) So Forth reads better than Scheme if you tend to verbalize in Tamil. Perhaps that is why I feel more comfy using vim than emacs.
Neither is particularly better or worse than the other and tools can be built appropriately. More so with language models these days.
On the other hand, Python does have "from some_library import child_module" which is always nice. In JS we get "import { asYetUnknownModule } from SomeLibrary" which is considerably less helpful.
It seems to me what the author desires is linguistic support for the Thrush combinator[0]. Another colloquial name for it is "the pipe operator."
Essentially, what this combinator does is allow expressing a nested invocation such as:
f(g(h(x)))
To be instead:
h(x) |> g |> f
For languages which support defining infix operators.
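Python has no `|>` operator, but the same left-to-right reading can be approximated with a helper function. This is a minimal sketch under that assumption; `pipe` is a hypothetical name, not something from the standard library.

```python
from functools import reduce


def pipe(value, *funcs):
    """Feed value through funcs left to right: pipe(x, h, g, f) == f(g(h(x)))."""
    return reduce(lambda acc, func: func(acc), funcs, value)


# Nested form, read inside out:
nested = len(str.split(str.strip("  a b  ")))
# Piped form, read left to right:
piped = pipe("  a b  ", str.strip, str.split, len)
print(nested, piped)
```

The trade-off is that every step has to be a one-argument callable, so extra arguments need a `lambda` or `functools.partial`.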
EDIT:
For languages which do not support defining infix operators, there is often a functor method named `andThen` which serves the same purpose. For example:
I miss the F# pipe operator (https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref...) in other languages. It's so natural to think of function transform pipelines. In other languages you have to keep going to the left and prepend function names, and to the right to add additional args, parens etc ...
A minor corollary to this is that, as the user types, IDEs should predictably try to make programs valid – e.g. via structured editing, balancing parens in Lisp like paredit, etc.
The author picked quite a convenient example to show the superiority of methods and lambdas.
I prefer list/set/dict comprehensions any day. They are more general, don't require knowing a myriad of different methods (which might not exist for all collections; PHP and JS are especially bad with this), and are easily extended to nested loops.
Yes, it could be `[for line in text.splitlines() if line: for word in line.split(): word.upper()]`. But it is what it is. BTW, I bet the Rust variant would be quite elaborate.
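For comparison, this is how the same nested loop is spelled in Python as it exists today, with the result expression first and the clauses following in loop order. The `upper_words` wrapper and the sample text are just assumptions for the sketch.

```python
def upper_words(text):
    """Upper-case every word on every non-empty line."""
    return [
        word.upper()
        for line in text.splitlines()
        if line
        for word in line.split()
    ]


print(upper_words("left to\n\nright"))
```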
Disagree. In the first example the author seems to want something more like imperative programming, so the "loop" construct would come first. But then the assignment should come last. With the Python syntax you get the thing you're assigning first, near the equals sign, and then where it is selected from and any filtering criteria. It makes perfect sense. If you disagree that's fine, the whole post is an opinion piece.
Not possible. There are more keystrokes that result in invalid programs (you are still writing the code!!) than keystrokes that result in a valid program.
More seriously, I do think that one consideration is that code is read more often than written, so fluidity in reading and comprehension seems more important to me than “a program should be valid after each keystroke”.
> While the Python code in the previous example is still readable, it gets worse as the complexity of the logic increases.
This bit is an aside in the article but I agree so much! List comprehensions in python are great for the simple and awful for the complex. I love map/reduce/filter because they can scale up in complexity without becoming an unreadable mess!
I agree with the main ideas of this article. Context-first left-to-right seems like it would be easier for LLMs to write and autocomplete well too.
This line, though, seems like it's using the wrong tools for the job:
len(list(filter(lambda line: all([abs(x) >= 1 and abs(x) <= 3 for x in line]) and (all([x > 0 for x in line]) or all([x < 0 for x in line])), diffs)))
To me it's crying out for the lines to be NumPy arrays:
import numpy as np  # assumes each line in diffs is a NumPy array
sum(1 for line in diffs
    if ((np.abs(line) >= 1) & (np.abs(line) <= 3)).all()
    and ((line > 0).all() or (line < 0).all()))
There's no need to construct the list in memory if you're just counting, and dealing with whole lines at once is much nicer than going element by element. On top of that, this version is much more left-to-right.
While methods partially solve this problem, they cannot be used if you are not the author of the type. Languages with uniform function call syntax like Nim or D do this better.
In 1990s-born scripting languages, it makes sense that there are plenty of design choices that don't mesh well with static-analysis-driven autocompletion, because that was not at all part of the requirements for these languages at the time they were designed!
[+] [-] juancn|7 months ago|reply
[+] [-] dleeftink|7 months ago|reply
Check out the DuckDB community extensions:
[0]: https://duckdb.org/community_extensions/extensions/psql.html
[1]: https://duckdb.org/community_extensions/extensions/prql.html
[+] [-] sgarland|7 months ago|reply
>The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
Per the SQL standard, you can't use column aliases in WHERE clauses, because the selection (again, relational algebra) occurs before the projection.
> You could even avoid crap like `SELECT * FROM table`, and just write `FROM table` and have the select clause implied.
Tbf, in MySQL 8 you can use `TABLE <table>`, which is an alias for `SELECT * FROM <table>`.
[+] [-] Arnavion|7 months ago|reply
[+] [-] pjmlp|7 months ago|reply
https://learn.microsoft.com/en-us/kusto/query/?view=microsof...
Also the LINQ approach in .NET.
[+] [-] lackoftactics|7 months ago|reply
[+] [-] shadowgovt|7 months ago|reply
[+] [-] dotancohen|7 months ago|reply
[+] [-] marcosdumay|7 months ago|reply
FROM table -- equivalent to today's select * from table
SELECT a, 1 as b, c, d -- equivalent to select ... from table
WHERE a in (1, 2, 3) -- the above with the where
GROUP BY c -- the above with the group by
WHERE sum(d) > 100 -- the above with having sum(d) > 100
SELECT count(a distinct) qt_a, sum(b) as count, sum(d) total_d -- the above being a sub-query this selects from
[+] [-] pzmarzly|7 months ago|reply
[+] [-] layer8|7 months ago|reply
[+] [-] benhurmarcel|7 months ago|reply
[+] [-] ARandomerDude|7 months ago|reply
[+] [-] de6u99er|7 months ago|reply
I usually start with:
```
select * from <table> as <alias> limit 5
```
[+] [-] 0x696C6961|7 months ago|reply
[+] [-] taeric|7 months ago|reply
[+] [-] cultofmetatron|7 months ago|reply
[+] [-] danielPort9|7 months ago|reply
[+] [-] necovek|7 months ago|reply
[+] [-] shadowgovt|7 months ago|reply
[+] [-] jibal|7 months ago|reply
[+] [-] mb7733|7 months ago|reply
[+] [-] williamscales|7 months ago|reply
For a language where there is supposed to be only one way to do things, there are an awful lot of ways to do things.
Don’t get me wrong, writing a list comprehension can be very satisfying and golf-y. But if there should be one way to do things, they do not belong.
[+] [-] SkepticalWhale|7 months ago|reply
[+] [-] smilliken|7 months ago|reply
[+] [-] stonemetal12|7 months ago|reply
[+] [-] jbs789|7 months ago|reply
[+] [-] noosphr|7 months ago|reply
[+] [-] mrheosuper|7 months ago|reply
[+] [-] Pxtl|7 months ago|reply
[+] [-] ivanjermakov|7 months ago|reply
[+] [-] xigoi|7 months ago|reply
[+] [-] kmoser|7 months ago|reply
[+] [-] layer8|7 months ago|reply
[+] [-] aquafox|7 months ago|reply
[+] [-] sriku|7 months ago|reply
[+] [-] Zarathruster|7 months ago|reply
[+] [-] AdieuToLogic|7 months ago|reply
[0]: https://leanpub.com/combinators/read#leanpub-auto-the-thrush
[+] [-] 0xfffafaCrash|7 months ago|reply
https://github.com/tc39/proposal-pipeline-operator
It would make it possible to have far more code written in the way you’d want to write it
[+] [-] zzbzq|7 months ago|reply
I've seen some SQL-derived things that let you switch it. They should all let you switch it.
[+] [-] hawk_|7 months ago|reply
[+] [-] nxobject|7 months ago|reply
[+] [-] haunter|7 months ago|reply
https://en.wikipedia.org/wiki/Non-English-based_programming_...
[+] [-] bvrmn|7 months ago|reply
[+] [-] phkahler|7 months ago|reply
[+] [-] andsoitis|7 months ago|reply
[+] [-] benrutter|7 months ago|reply
[+] [-] mkl|7 months ago|reply
[+] [-] xigoi|7 months ago|reply
[+] [-] frou_dh|7 months ago|reply