top | item 39799714

(no title)

bkanuka | 1 year ago

As someone who learned mathematics first and programming later, I think it took me about 10 years of working in data-intensive programming before I could write really "good" SQL from scratch.

I completely attribute this to SQL being difficult or "backwards" to parse. I mean backwards in the way that in SQL you start with what you want first (the SELECT) rather than what you have and widdling it down. Also in SQL (as the author states) you often need to read and understand the structure of the database before you can be 100% sure what the query is doing. SQL is very difficult to parse into a consistent symbolic language.

The turning point for me was to just accept SQL for what it is. It feels overly flexible in some areas (and then comparatively ridgid in other areas), but instead of fighting against this or trying to understand it as a consistent, precise language , I instead just go "oh SQL - you are not like the other programming languages I use but you can do some pretty neat stuff so we can be on good terms".

Writing good SQL involves understanding the database, understanding exactly the end result you want, and only then constructing the subqueries or building blocks you need to get to your result. (then followed by some trial and error of course)

discuss

Winsaucerer|1 year ago

I feel like a foreigner in another land when I read your comment and others like it. For as long as I can remember using SQL, I can't remember ever finding it more difficult or backwards than anything else I use.

That difference might go some way towards explaining why I prefer a much more database heavy/thick approach to writing apps than my peers.

sodapopcan|1 year ago

I agree. I never even thought about "select what you want first" as a problem until someone else pointed out.

Programmers seem far too sensitive about wanting everything to work one way. SQL is a very powerful DSL. It has its quirks but nothing that ever enraged me. I don't really care that it doesn't work like some other stuff I use, I just accept that I'm learning the language of a particular domain. This doesn't mean that I don't think there is always room for improvement. Of course I think FROM first would be a little nicer, but so much nicer that I think its worth changing a whole battle-tested standard? Not at all. The pain is so minimal I don't even feel it.

naasking|1 year ago

> I feel like a foreigner in another land when I read your comment and others like it. For as long as I can remember using SQL, I can't remember ever finding it more difficult or backwards than anything else I use

Learn linq or query/list comprehensions and then you'll easily see why SQL is backwards.

htag|1 year ago

I learned SQL before I learned set theory. While learning set theory I remember thinking "oh this notation is just SQL backwards." Afterwards I began to find SQL much harder because I realized there are so many ways to mathematically ask for the same data, but SQL servers will computationally arrive at the end differently and with very different performance. This is a minor deal if you're just doing small transactions on the database, because if you are dealing with pages of 100 objects it's trivial to hit good-enough performance benchmarks, even with a few joins.

I was first introduced to the issue of needing hyper optimized SQL in ETL type tasks, dealing with very large relational databases. The company switched to non-relational database shortly after I left, and it was the first time I professional witness someone make the switch and agreed that it was obviously required for them. We were dealing with very large batch operations every night, and our fortune 500 customers expected to have the newest data and to be able to do Business Intelligence operations on the data every morning. After acquiring bigger and bigger customers, and collecting longer and longer histories of data, our DBA team had exhausted every trick to get maximum performance from SQL. I was writing BI sql scripts against this large pool of SQL data to white-glove some high value customers, and constantly had to ask people for help optimizing the sql. I did this for a year at the beginning of my career, before deciding to move cities for better opportunities.

Lately, I've began seeing the requirements of high performance SQL again with the wave of microservice architectures. The internal dependency chain, even of what would have been a mid size monolith project a decade ago, can be huge. If your upstream sets a KBI of a response time, it's likely you'll get asked to reduce your response time if your microservice takes up more than a few percentage points of the total end to end time. Often, if you are using relational SQL with an ORM you can find performance increases in your slowest queries by hand writing the SQL. Many ORMs have a really good library for generating sql queries they expose to users, but almost all ORMs will allow you to write a direct sql query or call a stored procedure. The trick to getting performance gains is to capture the SQL your ORM is generating and show it to the best sql expert that will agree to help you. If they can write better SQL than the ORM generated than incorporate it into your app and have the SQL expert and a security expert on the PR. You might also need to do a SQL migration to modify indexes.

So in summary, I think your experiences with SQL depends heavily on your mathematical background and your professional experience. It's important to look at SQL as computational steps to reach your required data and not simply as a way to describe the data you would like the SQL server to give you.

mamcx|1 year ago

> I can't remember ever finding it more difficult or backwards than anything else I use."

This is the major problem. SQL looks like is not "difficult". You don't see (as a user) all their MASSIVE, HUGE, problems.

That is why:

- People barely do more than basic SQL

- People can't imagine SQL can be used for more than that, which leads to:

- Doing a lot of hacky, complex, unnecessary stuff on app code (despite the RDBMS being capable of it)

- Trying to layer something "better" in the forms of ORM

- Refusing to use advanced stuff like views, stored procedures, custom types, and the like

- Using of using advanced stuff like views, stored procedures, custom types, and the like, but wrongly

- Thinking that SQL means RDBMS

- So when the RDBMS fails, it is because the RDBMS is inferior. But in fact, is SQL that have failed (you bet the internals of the RDBMS are far more powerful than any NoSql engine, unfortunately, they are buried forever because SQL is a bad programming interface for the true potential of the engine!)

- So dropping SQL/RDBMS for something better, like JS (seriously?)

- And they are happier with their "cloud scale" NoSQL that rarely performs better, needs major, massive hacks for queries, or reimplements, poorly, ACID again, is more prone to data issues, etc.

And this is not even starting. If you think "is bad to make a full app, all their code, in relational model" that is how much brain damage SQL has caused.

---

I can count with my fingers the number of semi-proper DBs/SQL usage on my niche (ERPs) and that is mostly mine! (For example: I use dates for dates, not strings, like many of my peers!) and that is taking into account that I actually learned what the heck is that "relational" thingy after +20 years of professional use.

Go figure!

P.D: And then go to my code and see "what the heck, I could have done this in some few lines of SQL" and "what the heck, if only SQL were well designed I could do this dozen lines of SQL in 3!"

whatever1|1 year ago

The trial and error is the worst part.

In traditional languages, you can print iteration by iteration the intermediate result and understand if there is something wrong.

In SQL you sample output, and you keep changing the query until you think you get it right. And then 2 years later someone else finds that the query was wrong all this time.

jalk|1 year ago

Common Table Expressions (CTE) do help a little, as you can query each “table” and inspect the output. Debugging a giant query with deeply nested sub queries is very painful indeed

remus|1 year ago

> The trial and error is the worst part.

I don't know about anyone else, but I do this kinda naturally when writing SQL queries. Usually start with a base table, query the first 100 rows to see what the data looks like, start joining on other tables to get info I need, querying as I go to check join conditions, perhaps build out some CTEs if I need to do some more complex work, query those to check the format of the data ... And so on.

It doesn't feel that different to any other programming in that sense. Querying is printing.

fifilura|1 year ago

> you can print iteration by iteration the intermediate result

You would not be able to do that with a multi-threaded/multi-process application.

And this is the reason why e.g. Trino/Presto is so powerful together with SQL.

Instead of telling the computer how to go by to get your result, you tell it what result you want and let it do it in the best way.

The most up-front way of telling a computer "how" is a for-loop. And SQL does not have it. It may seem limiting, but avoiding explicit for loops gives the freedom to the computer. If it sees it fit to distribute that calculation over 200 distributed CPUs it can do that. With an imperative language you need to tell the computer exactly how it should distribute it. And from there it gets really hairy.

Ma8ee|1 year ago

Trial and error is usually a bad idea in all kinds of programming.

r00fus|1 year ago

I mean, I never build a query from front to back. Usually I build it FROM -> JOIN -> WHERE -> SELECT.

thaumasiotes|1 year ago

> widdling it down

Whittling. It means to carve something out of wood, with a metaphorical extension, as here, to gradually making something smaller by shaving bits of it away.

Strang|1 year ago

Important distinction. "Widdling" is urination.

mrits|1 year ago

I always thought writing SQL from scratch was the easy part. The hard part for me was coming back to my query a few weeks later

arrowsmith|1 year ago

This is true for most programming languages.

Winsaucerer|1 year ago

That's why I try (but sometimes forget) to extensively comment my queries that have any kind of complexity :)

mdcurran|1 year ago

This doesn’t totally solve the issue of SELECT’ing first then filtering, but for complex queries I’ve found CTEs very useful (whenever the database/SQL dialect supports it).

icedchai|1 year ago

What I usually do is start with "select *", get the joins and where clause down, then refine the select.

nextaccountic|1 year ago

> I completely attribute this to SQL being difficult or "backwards" to parse. I mean backwards in the way that in SQL you start with what you want first (the SELECT) rather than what you have and widdling it down.

> The turning point for me was to just accept SQL for what it is.

Or just write PRQL and compile it to SQL

https://github.com/PRQL/prql

392|1 year ago

You may like PRQL, which gives a more composable-atoms based approach. I find it far easier than SQL.

dmead|1 year ago

Saying what you want first rather than what you have is evidence of the von Neumann bottleneck or it was a sign of the times when SQL was being developed on 1970s machine.

Either way, point taken that it is not like a proof.

ako|1 year ago

Covey’s: “start with the end in mind” is not a bad advise when building something complex. With procedural languages you do the same, you first write the signature, parameters expected to go in and out, and then you start writing the way to achieve this.