top | item 34580392

(no title)

philmcp | 3 years ago

I'd say SQL is the most underrated engineering skill out there.

It amazes me when competent developers / analysts / data scientists don't know (any) SQL. Have they stopped teaching this in University or something?

discuss

order

throwaway23597|3 years ago

IMO there's a big difference between university level SQL and the ability to competently write SQL in a professional setting. In university, at least at my school, you are getting maybe at most 3 tables overall, with 3-5 columns each. Professionally, you're dealing with hundreds, maybe thousands of tables, each with between 10-1000 columns, and usually not even in 1NF either. It really takes a lot of practice writing queries professionally to get to a level of competence here.

mritchie712|3 years ago

I think it's bc of how popular ORM's are. You can get pretty far with an ORM without knowing any SQL, but you can inadvertently do some really dumb things if you don't understand how SQL works.

randomdata|3 years ago

SQL was the first language I ever learned, so perhaps my perspective is skewed, but it seems like it would be pretty hard to use an ORM without gaining an intuition into how SQL works.

What is a trap I see often among ORM users is them running into the n+1 problem. However, that's actually a consequence of database implementations not being theoretically pure. If you had an ideal SQL database then their approach would be technically correct. They aren't wrong to think that way.

It's just that we don't have ideal SQL databases, so we have to resort to hacks to make things work in the real world. Why we need to break from the theoretical model and use those hacks can be difficult to understand if you aren't familiar with the implementation at a lower level.

cnity|3 years ago

ORMs are funny: people say that they are great for the simple stuff (scans with simple joins and filters). But those cases are just that: simple!

listenallyall|3 years ago

I don't think it's the syntax of SQL that causes problems (and which ORMs try to replace)... it's the complex underlying logic necessary to accurately join multiple tables, set where and having conditions, and aggregating records, and knowing to which dataset each should apply.

Countless "replacements" for SQL have come around to make data accessible by non-expert, but regardless if your using a BI tool, reporting platform, ORM, whatever, pretty soon you're going to need to understand and express the logic, and that's when you really appreciate SQL's directness and terseness.

whoopdeepoo|3 years ago

Every single org I've found that says "we don't use ORMs because they're slow" ends up constructing their own bespoke, wheel-reinventing "orm" or sticks all their application logic in sprocs, which are nightmares for imperative logic.

Kon-Peki|3 years ago

My memory could be a little hazy, but I don't remember any required course that dealt with SQL when I was in the CS program at a pretty highly-regarded university 25 years ago.

I took a course in which I learned quite a lot about SQL and in retrospect it was an extremely useful course to have taken.

icedchai|3 years ago

I remember taking a databases class (either junior or senior year) that covered database design and also SQL with Oracle. We even got into Pro*C, which was some crazy Oracle-specific C pre-processor. It definitely wasn't required. My roommate took it and failed.

Izkata|3 years ago

Mine ~15 years ago in the US had a required course that did some relational algebra and database normalization without ever touching a database (as a small part of a broader course, IIRC), and an elective where we did really simple stuff with a database (I came out of it not really understanding the difference between INNER JOIN and LEFT JOIN, for example).

edejong|3 years ago

At a Dutch CS study we had an SQL course that started with first-order logic, relational algebra and went to on to project that into SQL. It also taught 3NF/4NF and BCNF, indices, r-trees, query planning and optimisation.

SCdF|3 years ago

We did some in CS101/102 (first year required CS courses). Specifically normalised forms and the like.

robertlagrant|3 years ago

We had to do it in the UK, circa 20 years ago.

cmrdporcupine|3 years ago

Database education in CS programs in North America seems minimal. Most people I meet haven't taken any, or maybe just took a practical "how to query" course. In North America, it seems like only CMU has an active (excellent) program around this topic.

There's a woeful ignorance about what the relational data model is, how the industry arrived here, and how this is implemented. Problems or archaisms with SQL specifically become synonymous in people's heads with the relational model generally, and for a while that led down the quite problematic NoSQL road. Then slingshotted back to SQL -- but from my perspective SQL itself (not the relational model) is a problem. It doesn't compose well. It doesn't handle recursive relations well. It has an awkward syntax. It conflates concepts. It has an archaic datatype model. None of this is intrinsic to the relational model, but SQL becomes a limiting factor.

Disclaimer: I work for a DB company doing awesome stuff with the relational data model, but not SQL (RelationalAI) so am ... biased. Though I have always had those biases. :-)

michaelmior|3 years ago

I'm a faculty member at the Rochester Institute of Technology. We have multiple courses for both our undergrads and grad students in databases. I primarily teach Introduction to Big Data for our graduate students. We happen to cover all the things that you mention in your second paragraph :)

alephxyz|3 years ago

I've worked with many data scientists whose typical SQL usage was to load entire rows (or with simple 'with' filtering) into Pandas / R dataframes and then do all their work there. I think it's a combination of Pandas and R having much simpler APIs and a ton of documentation on Stack Overflow, and modern hardware just being so good that you can load a big enough chunk of your dataset in-memory on a laptop.

bigger_cheese|3 years ago

I mostly use SAS, I tend to prefer using plain sql queries where I typically depart SQL and jump into code is doing what SAS calls "By Group processing" (example https://support.sas.com/kb/26/013.html#)

I am not as familiar with R. Last time I worked in R (some years ago) equivalent R code was something like this caution I'm no expert in writing R so might be a better /more intuitive way...

Output_data <-merge(x=T1, y=T2, by="Date", all.x="True") %>% mutate(My_var = NAME) %>% fill(My_var)

In SQL the equivalent would need to use Over (Partition by) which is less intuitive for me to write.

o_nate|3 years ago

Hard for me to imagine anyone who finds Pandas API more intuitive than plain old SQL. I can't do anything in Pandas without looking up syntax.

LarryMullins|3 years ago

> Have they stopped teaching this in University or something?

It wasn't part of the mandatory coursework when I went to school. I really think it should be. It's not as though my CS program was 'pure theory', they taught lots of 'vocational' stuff but not SQL.

As for why people put off learning it.. I put off learning it longer than I care to admit because I thought I could simply do without it (ORMs, BerkeleyDB, etc), and fake it if put on the spot (I always understood the very basics, insofar as a SQL query can be understood if read as declarative English.) Ever since I bit the bullet and actually learned it properly, I've been kicking myself for not learning it upfront.

giraffe_lady|3 years ago

I think one of the main uncertainties in higher education the last couple decades is what balance, if any, they should strike between job training for software developers, coding as an engineering discipline, and preparation for an academic career in math.

Most programs include parts of all three, but SQL per se (rather than like, relational algrebra) is pretty firmly in the professional training for software developers set, and schools that reject that aspect may not teach it.

In my experience recent code school/boot camp grads have as much or more practical sql as recent CS grads; probably because those schools are nearly totally focused on professionally applicable skills.

cmrdporcupine|3 years ago

That may be likely, but then schools which emphasize the math/theory part are then missing the opportunity to really teach the theoretical part of the relational algebraic stuff; which is really set theory and first order logic. There's a lot of math-y stuff to dig into there.

It should be mandatory along with understanding O notation and datastructures and algorithms, because really... unless you're doing a career which is like... 100% embedded development... you're going to be encountering databases and datamodeling as part of your career.

I'd be much happier if CS programs would graduate people who did a whole semester of first order logic, Date&Codd's foundational papers and why network&hierarchical databases are problematic, relational algebra / calculus, Datalog & friends, and then just toss in a "and this is how SQL does some of this but also mangles all of this..." at the end.

+ as a bonus, a DB implementation/internals course so people can understand what goes into query execution etc.

Because those people would then have some proper context before going off and butchering the world with ORMs and microservices...

spaniard89277|3 years ago

In my bootcamp (Spain) SQL is not widely-teached but there's a course about it (with Postgre) and you see something from time to time.

But AFAIK so does my local university.

pc86|3 years ago

I'm not sure you can call those folks competent if they can't interact with a database unless you give them an ORM.

waffletower|3 years ago

The issue isn't usually competency, can vs. cannot. It is usually a matter of should or should not. I have seen databases thrown at problems where they decidedly should not have, and have seen the product development cycle suffer where it should not have. While some engineers and analysts relish CTE spaghetti -- solving complex logic utilizing enormous wads of SQL, others are more wary and rightly look to tackle complexity with more accountable and powerful tools when possible. I admit that tools like DBT can help address some of the shortcomings of SQL, but it still does not have the generality and conciseness of modern languages.

maerF0x0|3 years ago

Anecdata point here, personally my SQL skills have lapsed because I've been incentivized to know NoSQL more. Same for frontend stuff like JS/React btw.

cess11|3 years ago

My programme in high school allowed me to choose a course in SQL, that's how I first learned it in 2002.