top | item 38532256

(no title)

MarkusWinand | 2 years ago

No question, such a query should not be written. That's probably the reason why this odd behavior, which is even different in various DBMSes, is not causing everyday problems.

discuss

order

marcosdumay|2 years ago

The reason the behavior doesn't cause problems is because everybody treats automatic aggregation as a voodoo where they know one recipe that works and anything different is domain of the Devil.

And, IMO, that's a very sane and reasonable way to treat it. The entire idea of automatic aggregation is flawed, and those queries should just have a `group by ()` explicit at the right place.

foldU|2 years ago

I agree, I think the original sin here is the fact that whether a `SELECT` is an aggregation is determined by the contents of the scalar expressions at all. I think most of this weirdness comes directly out of wanting to be able to write both `SELECT sum(x) FROM xx` and `SELECT x FROM xx` and have them work.

Not that I have a better solution offhand, in SQL grouping by a constant value is not actually the same as not writing `GROUP BY` at all since the behaviour on empty tables is different.

indigo945|2 years ago

Would you argue that automatic scalar-ism is also flawed, and the query

    SELECT a FROM aa;
should have an explicit grouping, like

    SELECT a FROM aa GROUP BY id;
?

After all, when you think about it, it's really not the aggregate functions that break expectations, it's the scalars. Of the four combinations of having or not having an explicit `GROUP BY` and having or not having an aggregate function, three of them have the unsurprising behavior of returning exactly one row per grouping.

    -- aggregate function and group by - one single result per grouping
    SELECT sum(a) FROM aa GROUP BY a; --or
    SELECT sum(a) FROM aa GROUP BY ();


    -- no aggregate function, but a group by - one single result per grouping
    SELECT a FROM aa GROUP BY a;

    -- just aggregate function, but no group by - one single result per (single, implicit) grouping
    SELECT sum(a) FROM aa;

But then, when you have neither an aggregate function nor an explicit `GROUP BY`, it breaks expectations:

    -- no aggregate function, no group by - one result row per row 
    -- in the source set, even though there should be only one big implicit group
    SELECT a FROM aa;
Therefore, I propose that the next SQL standard should introduce a new `GROUP BY INDIVIDUAL ROW` keyword, that henceforth all "scalar" queries MUST use in order to have consistent behaviour with the rest of the language.