No question, such a query should not be written. That's probably the reason why this odd behavior, which is even different in various DBMSes, is not causing everyday problems.
The reason the behavior doesn't cause problems is because everybody treats automatic aggregation as a voodoo where they know one recipe that works and anything different is domain of the Devil.
And, IMO, that's a very sane and reasonable way to treat it. The entire idea of automatic aggregation is flawed, and those queries should just have a `group by ()` explicit at the right place.
I agree, I think the original sin here is the fact that whether a `SELECT` is an aggregation is determined by the contents of the scalar expressions at all. I think most of this weirdness comes directly out of wanting to be able to write both `SELECT sum(x) FROM xx` and `SELECT x FROM xx` and have them work.
Not that I have a better solution offhand, in SQL grouping by a constant value is not actually the same as not writing `GROUP BY` at all since the behaviour on empty tables is different.
Would you argue that automatic scalar-ism is also flawed, and the query
SELECT a FROM aa;
should have an explicit grouping, like
SELECT a FROM aa GROUP BY id;
?
After all, when you think about it, it's really not the aggregate functions that break expectations, it's the scalars. Of the four combinations of having or not having an explicit `GROUP BY` and having or not having an aggregate function, three of them have the unsurprising behavior of returning exactly one row per grouping.
-- aggregate function and group by - one single result per grouping
SELECT sum(a) FROM aa GROUP BY a; --or
SELECT sum(a) FROM aa GROUP BY ();
-- no aggregate function, but a group by - one single result per grouping
SELECT a FROM aa GROUP BY a;
-- just aggregate function, but no group by - one single result per (single, implicit) grouping
SELECT sum(a) FROM aa;
But then, when you have neither an aggregate function nor an explicit `GROUP BY`, it breaks expectations:
-- no aggregate function, no group by - one result row per row
-- in the source set, even though there should be only one big implicit group
SELECT a FROM aa;
Therefore, I propose that the next SQL standard should introduce a new `GROUP BY INDIVIDUAL ROW` keyword, that henceforth all "scalar" queries MUST use in order to have consistent behaviour with the rest of the language.
marcosdumay|2 years ago
And, IMO, that's a very sane and reasonable way to treat it. The entire idea of automatic aggregation is flawed, and those queries should just have a `group by ()` explicit at the right place.
foldU|2 years ago
Not that I have a better solution offhand, in SQL grouping by a constant value is not actually the same as not writing `GROUP BY` at all since the behaviour on empty tables is different.
indigo945|2 years ago
After all, when you think about it, it's really not the aggregate functions that break expectations, it's the scalars. Of the four combinations of having or not having an explicit `GROUP BY` and having or not having an aggregate function, three of them have the unsurprising behavior of returning exactly one row per grouping.
But then, when you have neither an aggregate function nor an explicit `GROUP BY`, it breaks expectations: Therefore, I propose that the next SQL standard should introduce a new `GROUP BY INDIVIDUAL ROW` keyword, that henceforth all "scalar" queries MUST use in order to have consistent behaviour with the rest of the language.