Python: Just Write SQL

zzzeek|2 years ago

ORMs do much more than "write SQL". This is about 40% of the value they add.

As this argument comes up over, and over, and over, and over again, writers of the "bah ORM" club continuously thinking, well I'm not sure, that ORMs are just going to go "poof" one day? I wrote some years back the "SQL is Just As Easy as an ORM Challenge" which demonstrates maybe a few little things that ORMs do for you besides "write SQL", like persisting and loading data between classes and tables that are joined in various very common ways to represent associations between classes:

https://gist.github.com/zzzeek/5f58d007698c4a0c372edd95ab8e0...

this is why whenever someone writes one of these "just write SQL" comments, or wow here a whole blog post! wow. I just shake my head. Because this is not at all what the ORM is really getting you. Plenty of ORMs let you write raw SQL or something very close to it. The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects. Not to mention abstraction over all the other messy things the database drivers do like dealing with datatypes and stuff like that.

It looks like in this blog post, they actually implemented their own nano-ORM that stores one row and queries one table. Well great, now scale that approach up and see how much fun it is to write the same boilerplate XYZRepository / XYZPostgresqlRepository code with the same INSERT / SELECT statement over, and over again. I'd sure want to automate all that tedium. I'd want a one-to-many collection too maybe.

You can use SQLAlchemy (which I wrote) and write all the SQL 100% yourself as literal strings, and still use the ORM, and still be using an enormous amount of automation to deal with the database drivers and moving data between your objects and rows. But why would anyone really want to, writing SQL for CRUD is really repetitive and tedious. Computer can do that for you.

marcosdumay|2 years ago

The one distinguishing feature of an ORM is that it drops SQL and the relational paradigm and places the developer completely within the OOP world. If you drop this, you have merely a database connector library, with much more freedom of behavior than an ORM.

Yes, since they are an notoriously bad abstraction, every ORM will give you an escape hatch for the minority of tasks that it can't abstract at all. That escape hatch is not in any way a defining feature of the system.

Now, about that extra freedom that you get from dropping the requirement that your connector is an ORM... well, neither your data at rest, the presentation to the user, the abstraction for a service client, nor your validation rules benefit from OOP. Proof of that is that OOP interfaces from all of those things precede the current popular ones, and all of them were abandoned due to fundamental dissonance between the model and the data.

The rationale for an ORM is that, even though none of the interfaces you actually want for your data is best done in OOP, somehow OOP is still the best way to integrate them so that you can reuse data and metadata. This thought is not completely without merit, but there is very little empirical motivation for it, and the tiny amount that exists is more than completely explained by the OOP hype that only started to die around a decade ago.

EDIT: Oh I saw you wrote SQL Alchemy! First, thank you for that great piece of software.

Now, SQL Alchemy does provide a lot of useful ways to postpone the object representation or even to map your values into dumb data. My comment up there is on theoretical limitations, but on practice, I do think it's the best option available on Python. (And maybe about the best option afforded, since the language is intrinsically biased into OOP.)

lelanthran|2 years ago

Okay, you're the expert here, and I'll happily concede that I am not (and apologise in advance if I seem to be disrespectful), but ...

> The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects.

I think that that is the problem: mapping a relational dataset to a hierarchical dataset is the digital equivalent of pounding a square peg into a round hole.

I know you've read Ted Neward's "The Vietnam of Computer Science." (it's a short read, so search for it), and he articulated the same thoughts I had each time I had to work with an ORM in a mainstream language.

There is an impedance mismatch of sorts between relational data and hierarchical data. In cases where there is no hierarchy, you can do away with the ORM completely and turn out much easier to read code, for example with `sqlc` and `Go`.[1]

The problem with the ORM is the programming language - one in which hierarchical objects are the idiomatic way to code (all of the mainstream OO languages) is always going to require some wrangling.

> still be using an enormous amount of automation to deal with the database drivers and moving data between your objects and rows.

Unless you're using `sqlc` and Go, where the automation is provided

[1] Now you may argue that sqlc is technically an ORM, but then where do we draw the line for calling something an ORM? Is it the case that any method for "generating boilerplate for mapping relational datasets to in-program structures" is going to be called an ORM? Because to my mind, the result is so different for in-program representation that they aren't the same thing.

dagmx|2 years ago

I completely agree with you. Every single time someone says: “Just write your own abstraction over an sql generator ”, it eventually devolves into a full blown ORM.

I swear the majority of these opinion posts against ORMs are from people who must have worked in a badly implemented project that left them with a bad experience and they blamed the pattern rather than the technology.

One of the best bits about an ORM is making it consistent for non-db users on the team to simply grok and work with, without creating monstrous and hard to debug joins everywhere. But when badly set up it can lead to a lot of debugging spaghetti. Which is the same as can happen with SQL but I suppose people think that at least there’s one less layer to debug while ignoring the problem is actually how they got where they are and not the technology

I switched a project from manually written sql to sql alchemy on a project that’s used by multiple Oscar nominated films daily for reviews. The SQL version was gross and impossible for the team to manage because it had bloated over the years, with no nice way to detangle the statement generations and joins. SQL Alchemy made it so any one of the technical directors on the team could step in and add new functionality, without serious performance footguns. Instead of me having to clean up bad sql every year (projects would fork per film and merge at the end) to keep performance up, I could trust the ORM to do that for me.

At its worst, it was way too easy for TDs to get the raw SQL to be tens of minutes per review session by structuring their logic incorrectly, but it was so difficult to see. Switching to an ORM meant I could get the performance down to seconds per session and they couldn’t destroy the performance in subtle ways.

jpc0|2 years ago

> and still be using an enormous amount of automation to deal with the database drivers and moving data between your objects and rows

I've found I would generally only need the handling of database drivers and query building since I'm already writing a validation layer and to add data marshalling to that is pretty trivial.

Likewise practicing YAGNI, what is the chances I need multiple database drivers for different databases, it's extremely unlikely that I'm going to be chopping and changing between different databases so I'm really only writing that code once.

I would argue to start with writing the basic SQL queries and adding an ORM later when you know you actually need it.

It's much easier to onboard someone into pure python code + a db driver vs having to onboard someone to SQLAlchemy, since it is quite a complex piece of software, necessarily complex for what it is trying to achieve but if you don't need it it's not a good fit.

joaodlf|2 years ago

First of all, thank you for SQLAlchemy! If I ever had to make a final choice in how I would interact with a database for a very large project that involves a considerable dev team, I would always bet on SQLAlchemy. Not that I would necessarily like all aspects of it, but when it comes to Python and SQL - “Nobody ever got fired for picking SQLAlchemy.”.

With that out of the way, despite ORMs doing much more than "just writing SQL", it is exactly on that point that I flinch: Most devs should be exposed to SQL. And if your project allows you to build around simple enough abstractions so that you aren't reinventing the wheel, you should definitely be writing SQL. Especially if you don't know SQL yet - which is the growing case of new devs coming into the job market.

You can achieve a lot with SQlAlchemy Core, a tool that I absolutely recommend, but my post is just a simple alternative to get developers to think about their approach. If that results in some devs reconsidering using "full fat" SQLAlchemy and to try SQLAlchemy Core, that's a win for me!

Your gist tries to highlight the difficulty of doing certain things without an ORM. Migrations (as just 1 example) doesn't need to be hard, simple tools like flyway, or migrate (https://github.com/golang-migrate/migrate) achieve a similar result (while also keeping you on the path of writing SQL!). Deep and complex relationships between objects also don't need to be hard - typically people approach this subject with a requirement to be very flexible in the way they want to build queries and objects, but that to me in a sign that maybe they should reconsider their business logic AND reconsider that, just maybe, their project doesn't require all that flexibility, it is fairly straightforward to extend objects and introduce some more complex representations as and when it is needed - will all of this make me write code faster? Absolutely not. That is why you have spent so much time perfecting SQLAlchemy, but then again, I am not advocating for devs to go and replace their usage of ORMs, just presenting an alternative that may or may not fit their needs for a new project + give devs the chance to learn something that the ORM might have taken away.

xp84|2 years ago

> writing SQL for CRUD is really repetitive and tedious

Agreed, and this is the primary reason that ORMs are a necessary tool. And possibly that it's easier to train a junior developer to use one than it is to get them to a basic level of proficiency (and security awareness!) in SQL.

That said, I think this is one of those areas where It's Complicated, because inefficient database calls due to ORM usage is one of the primary ways I see applications completely break down. For most types of apps I've seen, which are very read-heavy and which aren't doing intensive writes, roughly all your writes should probably be using the ORM (including such niceties as validations, default scopes, all that nice stuff), and if it's simple enough, your "show" actions (fetch and display one entity) may be fine as well, but every "dashboard" and "index" action (show many entities, basically things with joins) likely need to be written in SQL.

In my experience (Rails), the object instantiation cost is insane, much greater than the actual time talking to the DB, so not only do you need to write SQL, but you need to handle the data that comes back without instantiating ActiveRecord models.

This is much harder work (more specifically, it needs much greater skill and experience, and is easier to make a mess of) versus using the ORM and models, but only on apps with tiny amounts of data per request, or very low request volume, can the ORM be a serious exclusive option for this task. Unless you want to end up like an app I once was asked to help fix, where they were on the $9,000 per month Heroku postgres instance, and since that's the biggest one, they could "scale" no bigger. (Okay, this wasn't their only problem, their main one was not understanding that you don't sprinkle analytics DB writes all over the place because now the simple high-volume "read" pageviews can't be generated using just a read replica).

SPBS|2 years ago

> The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects.

That's easy. It's a database mapping library. Write the query, give it an object and it fills it in or reads from it. You can do this with annotations, or struct tags in the Go world. There's no need to introduce abstractions over SQL joins which I find very off-putting, because the abstractions are never perfect and suddenly you have to learn invented concepts just because you didn't want to write a JOIN in an SQL query but rather have some clever framework introspect some classes and automagically write the JOINs for you.

IKantRead|2 years ago

To add to what you're saying: It's not like the Object-relational impedance mismatch[0] is some great unknown property of ORMs. Since people have been putting ORMs into production designers of these systems have been well aware that you must always chose a trade-off between a full functional object system and a fully relational one.

And the mismatch has two sides. If one's answer is "just use SQL!" then you're going to have new problems dealing with the mismatch coming from the SQL side of things.

The pre-ORM solution to this was not simply to shove a bunch of SQL in your application logic (though this was done), but to have a much, much more complex set of data layer logic. I'm guessing most people today writing about eschewing ORMs in favor of pure SQL have never used: stored procedures, triggers, cascades etc. I personally do miss some of the features from that era of software, but there's a lot more complexity to the "just write SQL" approach than most people realize.

0. https://en.wikipedia.org/wiki/Object%E2%80%93relational_impe...

BeefWellington|2 years ago

While I somewhat agree that a lot of these articles are people who just don't actually try/use the full feature set of ORMs, I don't agree with the overall premise you're presenting that they really do more than write SQL for you. The other things they provide are largely just abstractions around how the queried data is returned and some additional metadata tracking of the relationships. Your estimation of those parts being 60% of the value added is probably generally wrong for most users of SQLAlchemy. Not having to consider or use another language syntax while writing your code is probably closer to 70-80% of the value for most people.

Your example gist is essentially just "ORMs excel at this one thing, anything else is worthless" and also effectively hides code ("the benefit of the library!!!") to make it a very much disingenuous comparison.

> But why would anyone really want to, writing SQL for CRUD is really repetitive and tedious. Computer can do that for you.

Performance.

I wanted to love SQLAlchemy but even in some relatively simple things it generates really asinine queries that take insane amounts of time. It's just usually not noticeably insane amounts of time at first. So you query for some data and wind up with a query that returns in 20ms when there's 20 rows in the database but falls apart when there's 200,000.

It's also bad at projecting analytics type queries onto a transactionally-normalized database, but then most ORMs are not great at that.

I had both of these issues enough times that I instead just opted to start handcrafting queries in those cases and using a single "ResultSet" type class that projects as both a SimpleNamespace and a dict.

ansc|2 years ago

I’m using SQLAlchemy in my job, and have worked with Python for many years. Never have I seen a good case of someone using SQLAlchemy to hydrate objects from raw SQL queries. I’ll definitely admit — I have not gone out of my way to search for it. It seems that it is a common want to do this kind of 60% benefit ORM you speak of, but it’s definitely unclear to me how to pick those parts together with the daunting (and fantastic) piece SQLAlchemy is.

taeric|2 years ago

My only caveat to what you are saying, is that I have yet to see anything that successfully scales up to many tables. I can also count on one hand the number of products I have seen that successfully migrated between big databases in a meaningful way without a ton of ancillary rewriting in the process.

That said, I fully agree that the ORM isn't necessarily the problem. I point the blame at over eager data modelling that ignores the OLTP/OLAP divide. Or that ignores that row creation has a different contract than row editing, such that using the same object for both leads to confusion with missing fields. Heck, even different edits have different contracts, such that mapping a single "object" to a database row is often a mistake.

hot_gril|2 years ago

> writing SQL for CRUD is really repetitive and tedious

If you think of relational DBs as just CRUD machines, an ORM makes total sense, but that's the original mistake.

epgui|2 years ago

> It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects. Not to mention abstraction

That's actually the problem.

I'm sure you're aware that ORMs have this fundamental problem called the ORM impedance mismatch problem. AFAIK it remains unsolved to this day.

> You can use SQLAlchemy (which I wrote) and write all the SQL 100% yourself as literal strings

This is true, but it's like saying you can disassemble a Boeing 737 to build chopsticks. It's true but completely unnecessary.

gmassman|2 years ago

Thanks for you perspective, Mike. Completely agree that the interface between data and code should be handled by a single tool. That tool must meet some minimum complexity, because it’s solving a very hard problem! Also just want to say that my team has benefited greatly from your work on SQLAlchemy, and we appreciate you immensely!

habitue|2 years ago

Honestly, sqlalchemy is such a different breed of ORM, whenever people slander ORMs, I'm imagining they're thinking of like Django ORM or ActiveRecord which are like the duplos to SQLalchemy's Legos.

aforwardslash|2 years ago

> I'd sure want to automate all that tedium. I'd want a one-to-many collection too maybe.

You don't actually need an ORM for that. And some of us (I know I'm a minority) actually want to have pure data objects. Some of us (talking about myself again) actually design schema-first - as in, you use DDL and your code just consumes the definition. Some of us don't care about code knowing about relationships which are an intrinsic part of the data model.

> You can use SQLAlchemy (which I wrote) and write all the SQL 100% yourself as literal strings, and still use the ORM

SqlAlchemy is an amazing piece of software, and it is relevant becaise it is actually part of the problem. You don't really have an ecosystem of database abstraction libraries in python, as you have in other languages. The options are SqlAlchemy or Django ORM. Golang? You have 50 different libraries that implement 100 ways of doing the same stuff. C#? You have probably a dozen libraries besides Entity framework that provide different level of abstractions? Java? You got low level stuff up to high level ORMS. PHP? The same.

Python? You have SqlAlchemy, Django or roll your own. Both SqlAlchemy and Django can do raw queries with little effort - but if you're doing this, arent't you discarding most of the functionality that comes with it - namely, code maintenance? If you have a big project with a bunch handwritten queries (because you can do it, and quite easily, I may add), aren't you just ignoring any of the advantages of the orm? If, when you edit a model, you need to also check handwritten sql queries, you shouldn't be using an orm in the first place.

And with SqlAlchemy and Django, there is no middle ground. You either are all in, code-first definitions, or you are in a world of pain.

That is why specialized applications tend to be rewritten in other languages. Because optimizing the data layer alone often boasts 10-100x performance increase, when compared to this kind of abstraction.

> writing SQL for CRUD is really repetitive and tedious

It is. But between that and SqlAlchemy - in most languages, there are options. Not in Python. Between orm and no orm, I'll take no orm any day of the week - at least I can understand the queries.

tru1ock|2 years ago

I picked up some SQL knowledge through osmosis by using ActiveRecord and I do wish there was a better connect between what the ORM did and the end result. There were some tools to see what code generated what queries but it was not that intuitive and in your face like for example how you would have to deep dive frequently in your generated front end javascript and css code.

In other word I think there are some tooling left on the table that can assist in increasing SQL literacy and comprehension.

ckmar|2 years ago

> The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects.

If anyone is looking for this "pure" ORM (no query builder API) in Node there is https://github.com/craigmichaelmartin/pure-orm

TX81Z|2 years ago

I’m sorry but 90% of the time I encounter somebody who swears by an ORM I’ll figure out the reason they use it is because they didn’t know SQL to start with and didn’t commit to learning a new language.

The number of people who know SQL well and still choose an ORM seems to be very, very low in my experience.

hintymad|2 years ago

Do we know what Hibernate did wrong? It used to be very popular among Java engineers but later seemed have become an obscure technology.

siva7|2 years ago

badass mic drop. i will refer to your comment whenever i see this discussion coming up again and it will come up again as long as people are still learning the art of software development.

rtpg|2 years ago

If you're going to end up querying all the fields and putting them into a model like this dataclass anyways... Django can do that for you. If you're going to later pick and choose what fields you query on the first load, and defer other data til later.... Django can do that for you. If you're going to have some foreign relations you want to easily query.... Django can do that for you. If you're doing a bunch of joins and are using some custom postgres extensions for certain fields and filtering... Django can help you organize the integration code cleanly.

I totally understand people having issues with Django's ORM due to the query laziness making performance tricky to reason about (since an expression might or might not trigger a query depending on the context). In some specialized cases there are some real performance hits from the model creation. But Django is very good at avoiding weird SQL issues, does a lot of things correctly the first time around, and also includes wonderful things like a migration layer.

You might have a database that is _really_ unamenable to something like an ORM (like if most of your tables don't have ID rows), but I wonder how much of the wisdom around ORMs is due to people being burned by half-baked ORMs over the years.

I am curious as to what a larger codebase with "just SQL queries all over" ends up looking like. I have to imagine they all end up with some (granted, specialized) query builder pattern. But I think my bias is influenced by always working on software where there are just so many columns per table that it would be way too much busywork to not use something.

noirscape|2 years ago

The main problem I've encountered with complaints surrounding ORMs usually tend to be the result of trying to overfit the ORM in a certain way.

ORMs are, for the most part, good at the CRUD operations - that is to say, they easily translate SELECT, UPDATE, INSERT and DELETE operations between conventional class objects and database rows.

Things they usually aren't very good at are when you start trying to do things that require a lot of optimization - it's very easy to have an ORM accidentally retrieve way more data than you need or to have it access a bunch of foreignkey data too many times (in Django you can thankfully preload the latter by specifying it in a queryset). That's less an issue for basic CRUD, but is an issue if you're doing say, mass calculation and only need one column and none of the foreignkey data for speed reasons.

Basically - an ORM is good but don't let yourself feel suffocated by it. If it's not a good fit for an ORM, then don't do it in the ORM, either use SQL code to do it in the DB server or do a simpler SELECT (in the ORM) and do the complex operation in your regular application before INSERTing it back in the db (if that's a goal for the operation anyway). If it's outside of the CRUD types of DB access already, the extra maintenance overhead you get from having non-ORM database code (if you're doing the SQL approach) in the application would be there anyway, you'd just get a very slow application instead of a hard error, and the latter is easier to troubleshoot (and often fix), while with the former you need to start pulling up profiling tools.

jononor|2 years ago

The most stupid issues with "active record" type ORMs is the implicit queries on member access, especially in collections - leading to the N queries problem. But in SQLAlchemy one can actually turn that off - that is, make it throw an exception when undeclared table dependencies are attempted to be accessed. This restores sanity. And one gets to keep goodies you mention, plus Alembic migrations (mentioned by another). Also one can write direct SQL too with SQLAlchemy, or use the "core" layer to keep a DSL but avoid ORM.

lijok|2 years ago

> I am curious as to what a larger codebase with "just SQL queries all over" ends up looking like. I have to imagine they all end up with some (granted, specialized) query builder pattern. But I think my bias is influenced by always working on software where there are just so many columns per table that it would be way too much busywork to not use something.

You end up writing a query per usecase, rather than writing generic queries that can be stitched together however in the business logic

rowanseymour|2 years ago

> issues with Django's ORM due to the query laziness making performance tricky to reason about

It's infuriating that this is still not a thing you can disable (https://code.djangoproject.com/ticket/30874). Pretty much my only gripe with the Django ORM which I'm a huge fan of (and I also write lots of SQL).

cushychicken|2 years ago

I read this title and immediately thought "...but why wouldn't you just use Django?"

Having written the sort of SQL-inline code the author talks about, then refactored the whole thing to use Django: Django's ORM solves waaaay more problems than it creates in this regard.

aforwardslash|2 years ago

There are plenty of advantages of using a dataclass, being the most obvious the fact that behaves like a pure data object (aka it doesn't have underlying associated resources). Serialization/deserialization of data is dead simple, and a dataclass is a construct you can use as a data object when building 3-tier applications. Having pure data objects also gives way more flexibility when implementing cache strategies.

While this separation isn't common in the Django ecosystem, it is very common in enterprise application design (regardless of usage of an ORM). On complex applications, Django models are often a leaky abstraction (not only because the mentioned resource connection problem, but also issues like for relations they require the inclusion of the target model, it cannot be lazy-loaded; a good example is a nullable foreign key to an optional module that may or may not be loaded), and they actually behave like a variation of the ActiveRecord pattern, that mixes two different scopes - data and operation on data. In many cases this is ok, but in many others this is a problem.

I personally use a repository pattern, coupled with a query builder and something vaguely similar to dataclasses (its a bit more complex in the sense that the data object attribute name can be different from the database field name). It is basically an object mapper with a non-related repository class.

philwelch|2 years ago

> I am curious as to what a larger codebase with "just SQL queries all over" ends up looking like. I have to imagine they all end up with some (granted, specialized) query builder pattern.

One successful pattern I’ve seen treats the database as its own service layer. Service code does not send arbitrary SQL to the database—instead, all of the SQL queries are set as stored procedures in the DB. People sometimes freak out when I say this, so I should clarify that the stored procedures, table schemas and other things of that nature were version controlled and deployed with some minimal build tooling we’d developed in house.

I really liked this pattern. I think anything that you need to talk to across a network should be treated as a service in itself, with a well defined interface. This simplifies testing and monitoring as well. The big risk with an ORM, architecturally, is that you end up treating your database as a sidekick to your service code, or even a dumb data store—some shameful implementation detail your service keeps locked in the basement—when they’re capable of much more than that.

ilovetux|2 years ago

The only problem I have with using the django ORM is that it relies on the django project structure. While there are ways to use the ORM independently, they are full of hacks and trade-offs.

Granted, this problem goes away if you are building a web app or a REST API, but if I just want an ORM for a command line application, I am using django's management command functionality which is OK, but it doesn't really scale easily.

rglullis|2 years ago

Add refactoring, migrations and testing to all the reasons you mentioned, and it quickly becomes an adapted case of Greenspun's Tenth Rule for ORMs.

boxed|2 years ago

This is just reimplementing Djangos ORM, but badly.

ORM queries compose. That's why [Python] programmers prefers them. You can create a QuerySet in Django, and then later add a filter, and then later another filter, and then later take a slice (pagination). This is hugely important for maintainable and composable code.

Another thing that's great about Djangos ORM is that it's THIN. Very thin in fact. The entire implementation is pretty tiny and super easy to read. You can just fork it by copying the entire thing into your DB if you want.

mrj|2 years ago

This is what I came here to say.

For example, I'm working on an project now that long ago added a "sellable things" store that used plain sql. There are many, many stores like this one, but it did some logic to figure out what items are sellable and return the set. Easy, developer happy, ticket closed.

Some time later, it was needed to have "sellable items of a specific type." Well the "sellable things" store was too much to clone so the developer simply pulled all the sellables and filtered in memory. Hey it's Go so it's fast right?

This continued for a couple years and now I'm joining a project with a p99 of >15s. It would have been a natural fit to return a query set of sellable things and the other callers could further refine it however they wanted. Now I'm looking at a ball of logic that should have been in the database and it's beginning to break down at scale.

This article is just that pattern with syntax sugar. It will lead to sadness.

joaodlf|2 years ago

> This is just reimplementing Djangos ORM, bud badly.

I guess this is a good thing, as "reimplementing" Django's ORM is the opposite of what I wanted to do here :)

> ORM queries compose. That's why [Python] programmers prefers them. You can create a QuerySet in Django, and then later add a filter, and then later another filter, and then later take a slice (pagination). This is hugely important for maintainable and composable code.

I don't really disagree, but there are many ways to skin a cat. You can absolutely write maintainable code taking this approach. In fact, I can build highly testable, unit, functional, code following a abstraction very similar to this. The idea that "maintainable and composable code" can only be achieved by having a very opinionated approach to interacting with a database, is flimsy. I offer a contrary point of view: With the Django ORM, you are completely locked in to Django. You build around the framework, the framework never bends to your will. My approach is flexible enough to be used in a Django project, a flask project, a non web dev project, any scenario really. I want complete isolation in my business logic, which is what I try to convey just before my conclusion.

lelandbatey|2 years ago

Can you explain a bit more about the Django ORM being very thin and easy to read? It does seem like the Django ORM is thin (from an architecture perspective), but it doesn't seem to be small, it seems to be pretty big. Maybe I'm not understanding it though, so here's what I see:

The "ORM" part of Django seems to be everything in `django.db.models.Model`, which seems to require you to declare your Models as subclasses of the aforementioned class. Looking into that code though, it seems like the implementation supporting all this is around ~20,000 lines of Python: https://github.com/django/django/tree/main/django/db/models

That doesn't strike me as a super lightweight. For comparison, all of Flask (a Python WSGI web app framework, but mostly a 10+ year old project to compare to, and excluding tests) is ~4,000 lines of Python.

Is there a small subsection of the code in `django/db/models/` that is all that's necessary to use the ORM part? Or maybe I'm missing something about the "core" of the ORM?

sanderjd|2 years ago

Ha, should have read the comments before I wrote mine. Yep, it's this composability aspect that never seems to have occurred to the authors of this kind of think-piece.

I used to be pretty anti-ORM myself because I loathed the complexity of ActiveRecord (in the Rails world), but then I discovered arel, the nice composable relational query builder underneath, and saw the light. A composable layer of abstraction over SQL is critical in an application. (I still prefer raw SQL for analytical queries.)

pjmlp|2 years ago

Just like stored procedures compose, like in any procedural programming language.

zknill|2 years ago

Seems like there's 3 groups of opinions on ORMs:

Firstly (1); "I want to use the ORM for everything (table definitions, indexes, and queries)"

Then second (2), on the other extreme: "I don't want an ORM, I want to do everything myself, all the SQL and reading the data into objects".

Then thirdly (3) the middle ground: "I want the ORM to do the boring reading/writing data between the database and the code's objects".

The problem with ORMs is that they are often designed to do number 1, and are used to do number 3. This means there's often 'magic' in the ORM, when really all someone wanted to do was generate the code to read/write data from the database. In my experience this pushes engineers to adopt number 2.

I'm a big fan of projects like sqlc[1] which will take SQL that you write, and generate the code for reading/writing that data/objects into and out of the database. It gives you number 3 without any of the magic from number 1.

[1] https://sqlc.dev/

tantaman|2 years ago

I'm in a 4th camp: we should be writing our applications against a relational data model and _not_ marshaling query results into and out of Objects at all.

Elaborations on this approach:

- https://news.ycombinator.com/item?id=34948816

- https://github.com/papers-we-love/papers-we-love/blob/main/d...

- https://riffle.systems/essays/prelude/

tracker1|2 years ago

I'm pretty firmly in #2... it's relatively straight forward in a scripting language, and easy enough with something like C# with Dapper. In the end ORMs tend to over-consume, and often poorly. And even when they don't in most cases, they start to in more difficult cases. That doesn't even get into the amount of boilerplate for ORMs. You have to buy in to far more than their query model(s).

karmakaze|2 years ago

Maybe those are the main/popular groupings. Where I fall is that I want typesafe constructions of queries that match the current schema. The query compositions should follow the SQL-style structure so there's no 'shape-mismatch' composing the query using the library. Some may not consider this to be an ORM (though it does map relations to objects).

waffletower|2 years ago

There is definitely a fourth category -- "I want to build database queries natively using the paradigms of the language I am developing with, without use of SQL or an intermediary which translates into SQL."

specialist|2 years ago

> ...this pushes engineers to adopt number 2

Yup. Use of HQL/JPQL is proof that you shouldn't be using an ORM. Just use SQL instead. And once you have some SQL, it's just easier (overall) to do all SQL.

sanderjd|2 years ago

Well put!

NewEntryHN|2 years ago

Any serious application beyond the example given in this article will include conditional SQL constructs which go beyond SQL query parameters and will therefore require string formatting to build the SQL.

Think a simple UI switch to sort some result either ascending or descending, which will require you format either an `ASC` or a `DESC` in your SQL string.

The moment you build SQL with string formatting is the moment you're rewriting the SQL formatter from an ORM, meeting plenty of opportunities to shoot yourself in the foot.

nicoburns|2 years ago

> The moment you build SQL with string formatting is the moment you're rewriting the SQL formatter from an ORM, meeting plenty of opportunities to shoot yourself in the foot.

I used to think this, but at my last company we ended up rewriting all these queries to use conditional string formatting as we found it much more readable. The key was having named parameter binding for that string, so you didn't have to worry about matching up position arguments. That along with JavaScripts template string interpolation actually made the string-formatted version pretty nice to work with.

williamdclt|2 years ago

There’s a world between a query builder and an ORM. The point of ORMs isn’t to build queries, if that’s the only need might as well just use a query builder which is a lot more lightweight and doesn’t come with all the downsides of orms

coldtea|2 years ago

What you describe just needs a query builder (e.g. in Java something like jOOQ), not necessarily an ORM.

tracker1|2 years ago

Depends on the abstraction... for example .Net's extensions for LINQ are pretty good at this, I haven't generally used the LINQ syntax, but the abstraction for query constructs are pretty good, combined with Entity Framework. Of course, there's a lot that I don't care for and would prefer Dapper. In the end, the general environment of .Net dev being excessively "enterprisey" has kept me at bay the past several years.

drdaeman|2 years ago

IMHO the real proper solution is to have an SQL parser, so you can have your SQL represented as an AST, do some operations on it, then compile it back to a query.

Sadly, I'm not aware of any good solutions to this. SQLAlchemy Core can build an operates on an AST, but it doesn't parse raw SQL into a query (so one has to write their queries in Python, not SQL). Some parser libraries I've seen were able to parse the query but didn't have much in terms of manipulations and compiling it back.

Rudism|2 years ago

The next logical step after writing the code given in the article is to abstract common boilerplate SQL into a library so you're not spending 50% of your time writing and re-writing basic SQL insert, update, and select statements every time your models need to be updated. At which point all you've done is write your own ORM.

If you want to go full-blown SQL you can use something like PostGraphile, which allows you to create all of your business entities as tables or views and then write all your business logic as SQL stored procedures, which get automatically translated into a GraphQL API for your clients to consume, but once you move beyond basic CRUD operations and reporting it becomes incredibly difficult to work with since there aren't really any good IDEs that help you manage and navigate huge pure-SQL code bases.

If you're really dead set against using a powerful ORM, it's probably still a good idea to find and use a lightweight one--something that handles the tedious CRUD operations between your tables and objects, but lets you break out and write your own raw queries when you need to do something more complex. I think there's a sweet spot between writing every line of SQL your application executes and having an ORM take care of boilerplate for you that will probably be different in every case but will never be 100% at one end or the other.

hot_gril|2 years ago

The SQL inserts/updates have never felt tedious for me even in large projects, partially owed to careful use of jsonb for big objects where it makes sense (e.g. user settings dicts). Other than that, keeping a tight schema design.

bbojan|2 years ago

The article is missing the code for creating the "users" database table. What about indexes? Migrations? Relations to other tables?

I mean you can just write SQL instead of using the ORM if your project consists of a single table with no indexes that will never change, sure.

joaodlf|2 years ago

When it comes to migrations, I've been fine with https://github.com/golang-migrate/migrate

There are a multitude of extra things to consider, but none of those things are, in my opinion, imperative to having success with SQL in Python. Will it be hard to achieve the same level of convenience that modern ORMs provide? Absolutely. But there is always a cost.

I firmly believe that for most projects (especially in the age of "services"), an approach like this is very much good enough. Also, a great way to onboard new developers and present both SQL and simple abstractions that can be applied to many other areas of building software.

unknown|2 years ago

[deleted]

m000|2 years ago

And good luck with writing tests for your sql code.

somat|2 years ago

sometimes the idea is that the database lives it's own life outside the application. Probably not the case here, but under that viewpoint the application is just one of perhaps many that access the data and as such creating tables, indexes, migrations and relations are none of it's business.

promiseofbeans|2 years ago

> ... Python dot not have anything in the standard library that supports database interaction, this has always been a problem for the community to solve.

Python has built-in support for SQLite in the standard library: https://docs.python.org/3/library/sqlite3.html

uranusjr|2 years ago

Python also has the DBAPI specification, which defines what interface a library must support to be considered a database driver. The author claiming Go’s sql package encourages writing SQL directly while Python doesn’t really seems a bit awkward.

dikei|2 years ago

Also, DB-API 2.0 is a standard that's followed by most database drivers, similar to what JDBC is for Java, though not as strictly enforced.

reportgunner|2 years ago

sqlite-utils[0] is a great library for working with SQLite

[0] https://sqlite-utils.datasette.io/en/stable/

baq|2 years ago

It's fine advice... if you don't ever need to build queries programmatically (you will, probably) and don't care about type checks (you should, it's 2023).

If you don't know what you're doing on the DB-app interface, you're still better off with an ORM most of the time. If you don't know if you know, you don't know (especially if you think you know but details are fuzzy); please go read sqlalchemy docs, no, skimming doesn't count.

If you know what you're doing but are new to Python, use sqlalchemy.core.

PS. zzzeek is a low-key god-tier hacker.

rav|2 years ago

It's fine advice - if you can type check your queries. My colleague wrote a mypy plugin for parsing SQL statements and doing type checking against a database schema file, which helps to identify typos and type errors early: https://github.com/antialize/py-mysql-type-plugin

Jackevansevo|2 years ago

> I have spent enough time in tech to see languages and frameworks fall out of grace, libraries and tools coming and going.

I feel like Django ORM and SQLAlchemy are the de-facto ORMs for Python and have been around for over a decade. If anything I'd recommend juniors to pick one of these over hand rolling their own solution because it's so ubiquitous in the ecosystem.

ckdot|2 years ago

Congrats, you just wrote your own ORM. Please mind that ORM doesn’t necessarily mean ActiveRecord, which could be considered an anti pattern.

khazhoux|2 years ago

He didn't really. The SQL is right there, and this is important.

What I've experienced (unfortunately) across multiple projects is that people who understand databases will write SQL with a nice collection of helper and wrapper functions as needed, and the people that think that databases are mysterious black boxes will reach for ORM. I've seen the ORM-happy teams getting scared at the idea of a million (1,000,000!) rows in a table, and they always neglect to set up even basic indexes or to think through what their JOINs are really doing.

YMMV but that's the pattern I see again and again.

extasia|2 years ago

What's Active Record and why is it an anti pattern?

dotdi|2 years ago

Yes, this!

I've been burned countless times by Hibernate (and consorts) and now I argue in favour of plain SQL wherever I can. I do not imply that Hibernate is in itself bad, I just have collected many years of observations about projects built upon it, and they all had similar problems regarding tech debt and difficult maintenance, and most of them sooner or later ran into situations where Hibernate had to be worked around in very ugly ways.

Yes, I can understand some of the arguments for ORMs, especially when you get a lot of functionality automagically à la Spring Boot repositories.

And since nowadays I have more influence, I do advocate for plain SQL or - the middle ground - projects like jOOQ, but without code generation, without magic, just for type safety. We've been quite happy with this approach for a very large rewrite that is now being used productively with success.

badcppdev|2 years ago

If you are just going to "Just Write SQL" then I really don't think you should be coding your own Object and Repository classes.

My vision of the "Just Write SQL" paradigm would be a "class" or equivalent that would take a SQL command and return the response from the server. Obviously the response has a few different forms but if you're "just writing SQL" then those responses are database responses and not models or collections of models.

(For the record I think simple ORM type functionality is actually quite useful as your use case moves past the scale of small utility scripts.

fmajid|2 years ago

I'd take it a step further and move all SQL into stored procedures and call those using a functional interface. That's because of PostgreSQL's excellent stored procedure support, it might be harder with, e.g. MySQL.

One major benefit of stored procedures, in addition to separation of concerns, is that you can declare them SECURITY DEFINER and give them access to tables the Python process doesn't (in a way reminiscent of setuid), thus improving the security posture dramatically.

One example: you could have a procedure authenticate(login, password) that has access to the table of users and (hashed) passwords, but if the Python app server is compromised it doesn't have access to even the hashed passwords or even the ability to enumerate users of the system.

baq|2 years ago

Last project we've explicitly decided to not have any stored procedures ever since you basically can't test nor deploy them in any sane way. I'm all ears how you make it work.

pjmlp|2 years ago

That is my favourite approach, when I am allowed to have the last work how the DB layer should be like.

tantaman|2 years ago

I was sad to find that the author proclaimed "just write SQL" then fell into the trap of modeling his data as objects.

If you're going to model your data this way... you might as well use an ORM.

A better way is to just just write SQL (or datalog) and model our data, from DB all the way up to the application, as relations rather than objects.

Rather than re-hash, this idea has previously been discussed on HN here: https://news.ycombinator.com/item?id=34948816

davidthewatson|2 years ago

I'm happy to see someone mention peewee, having used it for numerous startup prototypes since its inception.

Coleifer does not get enough credit IMHO:

https://github.com/coleifer/

Peewee has been solid since I began using it a decade ago. Coleifer's stewardship is hard to see at once, but I've interacted with him numerous times back then and the software reflects the mindset of its creator.

radus|2 years ago

Peewee is excellent! I've especially enjoyed using it with SQLite - there are a number of handy extensions and very good support for user defined functions.

jrichardshaw|2 years ago

I'd definitely like to second how great peewee is. It's been a core part of running our telescope for the last 8 years. It strikes a nice balance between power and simplicity, a significant set of useful extensions and great documentation.

Sometimes I find it impossible to believe that @coleifer is just one person. Peewee has 2300 issues and 500 PR's none of which are open and outstanding, and almost all of which he has personally responded too in a genuinely helpful way. He pipes up on Stackoverflow for peewee questions too.

hot_gril|2 years ago

You don't need an ORM, but this isn't how you avoid one. If you're thinking of your DB as a mere object store / OOP connector like this article is, you're better off with an ORM or NoSQL than this basically equivalent DIY solution. It's best to instead learn how to use a relational DB like a relational DB, and the rest will follow.

Also, I'm not one of those people who dislike easy things (and will often whine about JS or Python existing). I'm all for ease and focusing on the business goals. It's just that ORMs and bad schema design will make things harder.

ris|2 years ago

I've been both ways on this, and ultimately I come down heavily in the camp of using ORMs for as far as it makes sense. Why? Sure, the "just use SQL because it's so simple" crowd use seductively simple examples, and indeed for very static use-cases it can be quite neat and simple.

But projects (almost always) grow, and once you need to start conditionally adding filter clauses, conditionally adding joins, things start to get very weird very fast. And no, letting the database connector library do the quoting for you won't save you from SQL injection attacks unless you're just using it to substitute primitive values.

And once all your logic is having to spend more space dealing with conditional string formatting, the clarity of what the query is actually trying to do is long gone.

I'll refrain from digging up the piece of code where I was having to get the escaping correct for a field that was embedded SQL-in-SQL-in-SQL-in-go. And I could hear the echos of the original author's "YAGNI"s haunting me.

robertlagrant|2 years ago

With SQLAlchemy, I come for the type checking. I stay for the Alembic migrations.

TwentyPosts|2 years ago

I feel this, sort of. I taught myself how to code by writing a Python bot (among other things), and eventually needed some sort of database handling to make things work. I decided on teaching myself basic SQLite, and just did it raw with `sqlite3`.

Currently 50% of my anxiety when it comes to my bot is related to database matters. There's no type checking so I gotta be careful when writing them, and stuff might blow up weirdly at runtime. Refactoring tables is also a major pain, or at least was until I (sort of) figured out a 'routine' of how to do it.

It's doable, and I assume that teaching myself some basic SQL and using it in production was a great learning experience, but once I'd reached the point where I was inventing database migration tooling from first principles and considering how to implement that, I realized that I probably just want to look at SQLAlchemy again.

BozeWolf|2 years ago

Indeed. So much more than this simple example. It gets interesting for more advanced use cases. If i now rename a field on the model, it will not be renamed in the database. If i want that to match, i have to change the query. And make a migration. But that is probably another simple blog post.

Putting it all together is another blog post. And if you have colleagues: probably needs documentation. Which you also have to maintain yourself.

WesolyKubeczek|2 years ago

My god, object-relational impedance mismatch seems to be more polarizing than US politics. I've been observing this field for more than 15 years, and it's always "JUST WRITE SQL!!!!!" versus "DRY!!! DRY!!! USE ORMs SO YOU DON'T HAVE TO WRITE SQL!!!! BUSINESS LOGIC!!!" shouting matches. It's like this topic itself takes 30 IQ points away from each participant and the conversation then devolves into complete chaos.

Maybe there's some professional trauma at work, as many of us have been traumatized by shitty databases and shitty code working with them alike, ORM or not. But ORMs do come and go, promise bliss, deliver diddly, and I'm reading the same stuff I've been reading in 2008, as if nothing ever changed since.

sergioisidoro|2 years ago

I've used Rails AR and Django/SQL Alchemy orms, and the more I use it, the more I wish for a fusion of both.

Django ORM is amazing for Schema management and migrations, but I dislike their query interfaces (using "filter" instead of "where"). I really like Rails AR way of lightly wrapping SQL, with a almost 1-1, and similar names, but does not have a migration manager - and there is always the chance that your schema and your code will diverge.

If I would get a Schema / migration manager, that would allow to do type checks and that would work well with a language server for autocompletes, but use SQL or a very very thin wrapper around SQL, that would be my Goldilock solution.

joaodlf|2 years ago

This is also why I really like Peewee in Python. If I am not going to write SQL, at least give me an API that looks and feels like it. When I look at Peewee code, I can often see the end result query.

chucke|2 years ago

Thea answer to your prayers already exists: http://sequel.jeremyevans.net/.

By far the best database toolkit (ORM, query builder, migration engine) I have seen for any programming language.

pak9rabid|2 years ago

Hmm, is AR's Migration framework not a migration manager?

hobbescotch|2 years ago

I’ve been a data engineer for many years and have lots of practice optimizing SQL touching many parts of the language and I still enjoy using SQLAlchemy for its tight, elegant integration with flask/django. Of course some queries make sense to optimize with raw sql but I think here, like with many other things, there’s no black/white conclusion to draw from these situations.

ggregoire|2 years ago

I've been using PugSQL to write SQL in Python [1].

With this package, you write the SQL inside SQL files so you can benefit from syntax highlighting, auto formatting, static analysis, etc. At the difference of writing strings of SQL inside Python files. I'm surprised this is not more popular.

[1] https://pugsql.org

duckmysick|2 years ago

Great concept, based on the Clojure library HugSQL.

Unfortunately it depends on a specific version of SQLAlchemy and won't run with the latest version.

megaman821|2 years ago

Question to the SQL-only people, how would you handle something dynamic? If I have a database of shoes and want people to be able to find them by brand, size, style, etc., what does that look like?

ggregoire|2 years ago

If you don't want to maintain several queries, you could write something like

    SELECT *
    FROM shoes
    WHERE (CASE WHEN :brand_id IS NOT NULL THEN brand_id = :brand_id ELSE TRUE END)
    AND (CASE WHEN :size IS NOT NULL THEN size = :size ELSE TRUE END)
    AND (CASE WHEN :style IS NOT NULL THEN style = :style ELSE TRUE END)

mitch3x3|2 years ago

If statements that either add conditional statements or blank lines to the SQL block. There are a lot of tradeoffs to the pure SQL method but I prefer being able to look up exact snippets in the codebase to find things.

fb03|2 years ago

Alright, let's use the custom approach. And then you need another field. and then you need some slight type checking or (de)serialization, which can change over time. You'll end up writing your own custom, kludgy ORM over time.

I have seen people write their own custom crazy version of GraphQL ("I've created a JSONified way of fetching only some fields from an API call) over ego or just ignorance. It's never a good path.

Why bother moving away from SQLAlchemy, which will do all of that for you in a simplified, type-checked and portable way? SQLAlchemy is literally one of the best ORMs out there, it's ease of use and maintainability is insane.

People that complain about ORMs might have never really used SQLAlchemy. It is that good. I'm a fan and zzzeek is huge force behind why it is so good.

And as always, if you need an escape hatch, you can use raw sql for that ONE sql statement that the ORM is giving you grief for.

plopz|2 years ago

I come from the PHP world and have used a variety of ORMs/query builders in that ecosystem but the most common issue I encounter that they don't handle well is when I want to do a "left join foo where foo.id is null"

ak217|2 years ago

I've seen many iterations of this type of debate by now, and learned to recognize the patterns. The people arguing for the ostensibly simpler solution are really asking you to trust their ability to architect apps out of simpler building blocks without using an abstraction that they don't like. This can work, but it often leads to situations like someone writing a bespoke system, then leaving the job or otherwise imposing extra complexity on the team. In the immediate term, what often gets overlooked is

- The ORM is an externally maintained open-source project with a plurality of contributors; "just write SQL" is not

- The ORM is designed to support the full lifecycle of the application including migrations; "just write SQL" is not

- The ORM is documented to be legible to newcomers; "just write SQL" is not (for all but the simplest of applications)

- The ORM is composable and extensible with opinionated and customizable interfaces for doing so (I've lost track of the number of times I've had my mind blown by how elegant and smart Django and SQLAlchemy's query management tooling is)

- The ORM has a security posture that allows you to both reason about your application's security and receive security updates when bugs are found

- The ORM is a platform for many other modules responsible for different layers of the application (DRF, OpenAPI, django-admin, testing utilities, etc. etc.) to plug into and allow the application to grow sustainably

I now try to guide people to a middle ground. Yes, both Django's and SQLAlchemy's ORMs can be annoying, have performance issues, etc. But for large applications maintained by multiple people over time, their benefits usually outweigh the drawbacks. Both have extensible architectures that allow customization and opinionated restriction of the interface that the ORM presents. If you're unhappy with your organization's ORM, I suggest you try that route first.

p4bl0|2 years ago

I always write SQL code directly, but that's mostly because I actually enjoy writing SQL queries :).

seunosewa|2 years ago

Same here. SQL is arguably the best language for writing queries. That's what it was designed for.

impulser_|2 years ago

If you want to try out something cool, check out

https://github.com/sqlc-dev/sqlc

It's written in Go and it converts your sql migrations and queries into typesafe code that you use access your database.

It currently has a plugin for Python that's in Beta, but what essentially does something similar to what this post is saying.

https://github.com/sqlc-dev/sqlc-gen-python

You write your migrations, and queries and a config file and it does the rest.

monomers|2 years ago

There's a whole family of libraries like that. Yesql is the first I became aware of. The repo has an (incomplete) list of ports to other languages: https://github.com/krisajenkins/yesql#other-languages

dep_b|2 years ago

A good ORM knows when it needs to fuck off. I just want an easy and boiler plate avoiding way to do crud operations on certain tables and map a custom type against a custom query. The lengths I have to go through to just have a custom query in some ORM's is mind-boggling. I remember fighting Microsoft's Linq to SQL or whatever the incarnation was called so hard. I could do it in the end but it fought me all the way to the end.

Improvotter|2 years ago

It might be worth mentioning LiteralStrings from [PEP 675](https://peps.python.org/pep-0675/) and how you should use them to prevent SQL injections. I'm not sure this blog adds much to the discussion when it comes to when to write SQL and when not to. It does not cover the struggles, the benefits, and the downfalls.

waffletower|2 years ago

As a data engineer, the pattern the OP shares is very familiar. I find it much preferable to use of ORMs for wide variety of reasons. However, I view implementing with SQL as an antiquated problem rather than a pragmatic feature. The evolution of this pattern would be to integrate database querying into languages more directly and eliminate SQL entirely. While this could be achieved in Python, I find that a language like Clojure, via functional programming (FP) primitives and transducers, is a natural candidate, particularly for JVM implemented databases. Rather than encapsulating SQL via query building or ORM based APIs, an FP core could be integrated into database engines to allow, via transducers, complex native forms to be executed directly across database clusters. Apache Spark is an analog of this. In particular the Clojure project, powderkeg (https://github.com/HCADatalab/powderkeg), as an Apache Spark interface, demonstrates the potential of utilizing transducers in a database cluster context.

atoav|2 years ago

Recently I was wondering myself whether I should just write SQL as I didn't particularly enjoy working with SQLAlchemy.

Then I discovered peewee. I am happy now.

mkl95|2 years ago

A better title would be "just write your own ORM".

I have used several Python ORMs over the years, both for SQL and NoSQL. SQLAlchemy is the most powerful way of interacting with a relational database I have experienced.

I also write Go, and when I do, I do not use an ORM. But when it comes to Python I know my solution won't be better than SQLAlchemy, so why bother rolling out my own?

sanderjd|2 years ago

I always ctrl-f to search for the word "composability" when I come across arguments like this. I could take or leave ORMs, but relational query-building libraries are invaluable for composability, compared to proliferating mostly-duplicative raw SQL in format strings all over the place.

c120|2 years ago

So far all my projects have targeted a specific database with no reason to change it.

So what I do is write SQL commands, but keep all inside a specific file or module of the project. So that I can decide later to refactor it into an ORM.

I think ORMs are great if you write libraries that target more than one database. Or situations, where you have more than one database and need a proper migration part.

If you don't need migration, but in the worst case can start with a fresh, empty database, then write SQL.

But for production system, the no/manual migration might get old quickly. Writing migration code that just adds fields, indexes or tables is easy. But writing code that changes fields or table structures? Not do much.

Still, you don't need an ORM at the beginning of a project, just don't put SQL everywhere.

Dowwie|2 years ago

You'll eventually write your own dynamic query building logic if you take this development path

lifewallet_dev|2 years ago

I can already see this doesn't have connection pooling which all those ORMs he listed have without you knowing what a connection pool is it just works, and scales, doing that on your own is not easy.

phatboyslim|2 years ago

Technologists have a hard time accepting an established standard. Email is a perfect corollary to this conversation. There is a graveyard of companies that have attempted to "Solve email", yet it is still ubiquitous and attempts to 'improve' it continue to fizzle out. I'm not saying that progress, or an attempt at progress, is pointless, but to argue that writing vanilla SQL is somehow antiquated or archaic is false and OP makes several valid points highlighting why it is a perfectly valid approach.

Sparkyte|2 years ago

Sentiments on this is that sticking close to native as possible reduces coherency issues between anything. Adding layers of abstraction on top of layers of abstraction often reduces contextual understandings and further dilludes the problem solving technique. If the abstraction is truly needed a thurough way to evaluate executions is needed and a proper way to contextualize which that is not. In-line comments or even very easy to navitage documentation but the former thing or even both is superior to the latter.

ruuda|2 years ago

One challenge working with SQL from statically typed languages (including Python + Mypy) is that you have to convert the query inputs/outputs to/from types and it's a lot of boilerplate. I started an experiment to generate this from annotated queries. [1] Python support is still incomplete, but I'm using it somewhat successfully for using SQLite from Rust so far.

[1]: https://github.com/ruuda/squiller

tantaman|2 years ago

similar project that generates types for queries into an intermediate representation that can be consumed by, say TypeScript, to get static Types: https://github.com/vlcn-io/typed-sql

sams99|2 years ago

For those looking for a rubyish approach to this see: https://github.com/discourse/mini_sql

stuaxo|2 years ago

Django isn't just about the ability to programmatically stick together things to make your query or the migrations. It's that, combined with tools that help you debug database issues, and most importantly the patterns that it imposes.

As a Django developer it's straightforward to go from one Django project to another, which isn't the case with other stuff as you don't know where everything is going to be.

molly0|2 years ago

An ORM makes sense if you need to make very dynamic SQL queries, ie advanced logic at runtime.

If your app can work well with static queries then you should not add an ORM.

slotrans|2 years ago

Yes. Just write SQL. SQL is good.

Never use an (active record) ORM. Ever. For any purpose. They are a disastrous idea that should be un-invented.

I have this discussion with a lot of people who claim they cannot imagine working without an ORM and that "surely" using SQL is so much more work blah blah blah. Yet they have never tried! And aren't willing to try!

You should try it.

hprotagonist|2 years ago

If you haven’t yet, check out https://pugsql.org/ . all the power of sqla-core, none of the ORM fuss.

  PugSQL is a simple Python interface for using parameterized SQL, in files, with any SQLAlchemy-supported database.

pharmakom|2 years ago

I no longer feel the need for an ORM. Here's what I do instead:

- immutable record type for each table in the database (could be a data class)

- functions for manipulating the tables that accept or return the immutable record types and directly use SQL

- that's it

you can generate the functions from the database schema if its too much boilerplate.

timmit|2 years ago

Based on my personal experience, I have seen some raw SQL codes about 200 to 1000 lines in some production source codes,

not readable at all, not easy to change, which is a terrible development experience.

I guess if it is simple CRUD, it does not give too much problem, but it will definitely work in a complex case.

jredwards|2 years ago

There's a huge module in our python codebase that approaches building queries in roughly raw SQL. Let me tell you, tracking how data moves from one stage to the next in that "ETL pipeline" is an absolute nightmare. Never again.

metalforever|2 years ago

What happens at big companies is that they will build a custom ORM over time , and it will be way shittier and more vulnerable than if you had just used one in the first place.

semrekkers|2 years ago

Shameless plug, with channel support: https://github.com/semrekkers/sqlz

never_inline|2 years ago

The premise is that Go language users only use standard library SQL package.

Anecdotally, I haven't seen a place where Go is used without something like gorm or sqlc.

rowanseymour|2 years ago

We've always used https://github.com/jmoiron/sqlx which is just the standard package + mapping to/from structs.

CodeWriter23|2 years ago

SQL and ORM both have merit in different situations. Pick the correct tool for the given use case, and don’t be afraid to mix & match IMO.

bastardoperator|2 years ago

No thanks, been there, done that. Writing SQL by hand almost never scales.

izoow|2 years ago

To those who write plain SQL in Python, what do you use for migrations?

tpoacher|2 years ago

A combination of Ibuprofen and Paracetamol usually does the trick.

... oh wait, I thought you said migranes.

sakex|2 years ago

C++: Just write Assembly

bafe|2 years ago

Just write SQL and eventually you will reinvent 50% of any ORM

felipetrz|2 years ago

"without using ORMs"

...

Proceeds to create an ad-hoc ORM.

kervantas|2 years ago

Every ORM basing post is like this. Some dude is dissatisfied with Hiberante/GORM/SQLAlchemy, declares ORMs as an "anti-pattern", then proceeds to reinvent the wheel.

ploppyploppy|2 years ago

Low quality naive summary.

jlnho|2 years ago

How is this article different from your comment, then?

277 comments