After all those years, I sometimes feel like the whole ORM thing hasn't made progress or been settled. Which ORMs do you like and what do you like about them? What is the sweet spot? I'm asking this because database I/O has always been my biggest frustration when doing application programming. These days, I do more functional programming and tend to side with the ORM haters but I do occasionally wish I could just abstract out the database more. Anyways, I'd just like to hear HN's thoughts on this.
[+] [-] ciconia|8 years ago|reply
I believe in most cases the efficiencies that can be gained from using an ORM is quickly offset by a substantial dependency on third-party code, limited control of performance, and inability to express more complex SQL expressions, such as CTE's, lateral expressions, JSONB functions and operators etc.
ORM's also tend to pile on lots of functionality, of which most projects will normally only use a small fraction.
For an in-production system I've been maintaining for the last 10 years, I've recently ripped out the ORM code, replacing it with raw SQL queries, and a bit of DRY glue code. Results: less code, better performing queries, and less dependencies.
[1] https://github.com/jeremyevans/sequel
[+] [-] bitexploder|8 years ago|reply
Good ORM lets you use a small slice of their features without any big performance hit in terms of development or actual. With a nice object model it just seems easier to remember and work with code.
When I need complex SQL I just write a raw query... It is trivial and concise in any ORM to drop down to raw SQL when needed.
I can agree they are no silver bullet, but a good ORM does so many chores for me I can't imagine bothering to do them myself, anymore when the pattern of using objects in my code and storing them in a SQL DB solves the problem at hand.
[+] [-] TeMPOraL|8 years ago|reply
I'd like to hear more of your thoughts on that.
My experiences with ORMs are less about too much abstraction, and more about them encouraging wrong abstractions. I'm yet to work with an ORM-using project that doesn't employ ORM objects as business model objects. The ones I've worked with always exploded due to the fact that even with ORM, database model doesn't map 1:1 to how the business logic would like to see the world - and the tension of trying to use one set of objects for both purposes at the same time is what messed up the codebase.
[+] [-] nogbit|8 years ago|reply
You are right, very little of the ORM gets used in the end and one almost always needs to write custom SQL anyway (models don't always map 100% to app layer). Most importanly, as languages change, evolve or dry up it's easier to find or train new talent with SQL experience vs. a language specific ORM.
[+] [-] joeandaverde|8 years ago|reply
Some argue that using an ORM means you can switch underlying database technologies on a whim. I think this is an incredibly weak argument. How often do people truly switch database technologies?
I created a small wrapper around the node postgres library to make querying a little easier.
Have a look at https://github.com/joeandaverde/tinypg - It's a no frills library that makes it easy to execute SQL files as prepared statements and pass parameters as keys on objects.
[+] [-] mewwts|8 years ago|reply
[+] [-] dyeje|8 years ago|reply
[+] [-] tony|8 years ago|reply
- Avoids mistakes when dealing with writing raw SQL queries (SQL is quite repetitive in practice)
- The declarative nature of classes maps well to types and relationships
- The declarative nature of classes maps out well to tables, even with polymorphism [1]
- Keeping "Models" in an ORM often maps out well to migration utility (Alembic, Django Migrations)
- Object-chaining map very well to queries
- ORM objects can be reused and composed
- They can abstract out intricacies across SQL dialects
- They can potentially make it easier to migrate to different SQL servers if no specialized features were used
- Can help avoid common security vulnerabilities like SQL injections
- When something can't be expressed via ORM relationships, they tend to allow the dev to drop down to raw SQL. In the case of SQLAlchemy, there is a core query language [2], too.
- In the case of Django, QuerySet is used as a standard throughout extensions that power a whole community. Plugins that don't even know each other (e.g. django-filter and django-tables2) can operate on the same django queryset to filter/search and sort/display data.
I mention QuerySet/Django ORM quite a bit in a recent blog post at https://www.git-pull.com/code_explorer/django-vs-flask.html.
[1] http://docs.sqlalchemy.org/en/latest/orm/inheritance.html [2] http://docs.sqlalchemy.org/en/latest/core/
[+] [-] barrkel|8 years ago|reply
This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.
The closer an ORM's API is to a monad, the happier you'll be. Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see.
IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed. Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.
[+] [-] jasonkester|8 years ago|reply
If you define your schema in your database and derive your data layer from that, you get everything in your list (apart from that thing that totally always happens where you switch your underlying database technology every few months).
But then you don't have your database defined in two places. And if anybody ever does modify the db by hand, your build will break and it will quickly surface itself as an issue at compile time instead of via an obscure error message somewhere 40 levels deep in the call stack.
[+] [-] richardknop|8 years ago|reply
[+] [-] tastyham|8 years ago|reply
Compared to something like Hibernate or SQLAlchemy that tries to support everything under the sun and can result in a lot confusion when trying to understand what exactly it's doing.
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] slackingoff2017|8 years ago|reply
I never understood the ORM hate. Every place I worked we intermingled raw SQL when needed. Hibernate has a way to clear query caches so you can use raw SQL when you need to. You can just write raw SQL exclusively if you want within hibernate so I don't get how you could lose anything :) .
Still, my experience is mostly with Hibernate. It's extremely mature, meaning reliable, feature complete, and only 0-30% slower than raw queries in most cases. It makes adding support for things like multitenency and audited tables a breeze without losing database agnostic behavior. It makes database upgrades a no-brainer 95% of the time too. It has a built in memory caches that help enormously with common queries. Probably the biggest thing is it makes the database strongly typed so it's possible to refactor. Code refactoring in Java is easy but raw stringified SQL is nearly impossible to fix in any language.
I think the biggest counterpoint to ORM is shitty ORM. Things like SQLAlchemy generate downright horrific SQL slowing everything to a crawl and causing deadlocks. Another honest counterpoint to ORM is the learning curve. Everyone is taught SQL but the ORM is more abstract and harder to reason about, not a fun thing to learn.
TBH I think most ORM's are just poorly done. Putting an object abstraction on a relational database is hard. The only ones I've enjoyed for completeness and performance are Hibernate and to some extent Entity Framework. EF being the easiest to use but a bit slower with less features.
I have heard good things about Dapper but never used it. I like the idea of injecting a SQL DSL into the language itself, wish it was more prevalent.
[+] [-] Capt-RogerOver|8 years ago|reply
[+] [-] wiz21c|8 years ago|reply
Not my experience... Would you care to explain ? SQLA works pretty well for me, as long as I stay in rather simple queries. For example, data analysis queries are next to impossible to express with SQLA, but that doesn't matter much.
For the simple update queries, it just works fine for me. It also allows for very good control on relationships, etc.
[+] [-] fzaninotto|8 years ago|reply
First, the need for data has moved from the server to the client side in API-centric architectures. The backend is often just a dumb HTTP layer on top of a (sometimes relational, sometimes not) data persistence. Heck, I even built apps without server, using PostgREST.
Second, on the client side, we don't need an object-oriented interface to the datastore, because JavaScript. But we do need a powerful declarative query builder, and static data types to build up on. GraphQL provides that.
Third, if the purpose of the ORM was ever to abstract the SQL dialect of the database, GraphQL has resolvers, which turn a GraphQL request into a request for the database of your choice. So it does that job, too.
Last, for the data crunching jobs that you decide to do on the backend, you often need specialized data stores (event store, document store, full-text search index), and ORMs don't address that. And since these jobs are often resource intensive, you need the best performance - therefore falling back to raw SQL in most cases.
ORMs were a good solution for the kind of software architectures that were used for the web in the 00s. Since then, we have different needs, which require different solutions.
Disclaimer: I was the maintainer of a popular open-source PHP ORM [1] for 3 years. I no longer use ORMs.
[1] http://propelorm.org/
[+] [-] ruslan_talpa|8 years ago|reply
[+] [-] coldtea|8 years ago|reply
To my opinion, it was a bad solution to the wrong problem.
For one, we're not that enamoured with objects anymore (what with functional programming, immutability, etc).
Second, SQL and DDL, being declarative, is both a higher abstraction that (at least) most ORMs, and offers more fine level control to the DB at the same time!
Third, people don't really switch databases that often, for the abstraction between different SQL syntaxes to matter.
[+] [-] Scarblac|8 years ago|reply
But third-party libraries shouldn't assume any specific SQL database, so if you want libraries that can do database things, ORMs are very useful.
[+] [-] codeulike|8 years ago|reply
"ORM is the Vietnam of Computer Science" (2006)
https://blog.codinghorror.com/object-relational-mapping-is-t...
[+] [-] _Codemonkeyism|8 years ago|reply
"ORMs are a thing of the past" (2009) http://codemonkeyism.com/orms/
"Be careful with magical code" (2010) "[...] ORMs and in particular Hibernate [...]" http://codemonkeyism.com/beware-magical-code/
[+] [-] barrkel|8 years ago|reply
No ORM comes remotely close to being able to solve this kind of problem. The kinds of things I need to consider:
- analyzing select / predicate / order / etc. for tables used and selectively including tables
- forcing query evaluation order by creating a subtable query with conditions, projecting out an id, and rejoining in the next query up
- analyzing predicates and rewriting predicate trees to push predicates down into subtable queries, or convert predicates into joins, or replace references to columns on other tables with denormalized columns
- gathering heuristics about query set size and changing query generation technique
- splitting queries into parallelizable queries and composing result set client-side
An ORM doesn't really help me. Even a library tuned for symbolic manipulation of SQL doesn't help hugely; my life is made much simpler when I can freely extend the SQL AST classes with my own analytic and rewrite methods.
[+] [-] joncrocks|8 years ago|reply
With a suitably tweaked ORM, you can pass platform-specific hints down to the DB. The platform's query optimiser should be (obviously up for debate!) the one that's taking a query and working out the most efficient way of executing it.
Not saying it always will always pick the best execution, but if you're getting to the point of supplanting the optimiser you're starting to replace the database itself.
At which point you're probably beyonds the realms of what an ORM can and should reasonably doing, you're starting to writing your own custom persistence platform.
[+] [-] flukus|8 years ago|reply
I find an ORM really excels when you've got complicated business logic, the sort of thing where there are dozens of business rules that may or may not be involved in any given installation. Something like this is extremely stateful and and ORM does a great job of tracking this state. In fact, this is what I'd define an ORM as, a state tracker for database modifications. If you don't have much state then you probably won't get much out of an ORM.
Even when using an ORM does make sense, it's important to remember that they don't make sense everywhere. For a performance critical or particularly complicated query then SQL should still be a fallback.
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] madiathomas|8 years ago|reply
Speed of development greatly improved since I switched to EF. Speed of development is what matters most for me when building first version of an application.
[+] [-] TurboHaskal|8 years ago|reply
[+] [-] realusername|8 years ago|reply
ORM haters would likely point out that complicated queries are hard to write with an ORM but these complicated queries are not a massive part of the work and can still be abstracted away in a method using SQL if needed.
Another good point of the ORM: validation of data. There are plenty of validation you cannot enforce in SQL (regex, postcode...).
[+] [-] empthought|8 years ago|reply
Those particular examples are trivial to enforce in every DBMS I am aware of...
[+] [-] bendermon|8 years ago|reply
SQL is the most concise and perfect fit for RDBMS.
However, at the application level there are benefits of using ORM.
- The application itself is usually imperative style as against the declarative nature of SQL.
- Chaining is sometimes more readable and concise. One can chain dynamic filters.
- Abstract the underlying data model with higher level names. SQL eq. of table views.
- Hides the underlying relational model. Which can sometimes be helpful in a large code base. And sometimes a curse.
I normally opt for ORM in Rails/Django web apps. But SQL in
- Performance critical - Report generation, where it might be complex and declarative nature of SQL shines.
[+] [-] falcolas|8 years ago|reply
Just a week ago, I ran into an issue where SQLAlchemy wouldn't support a "in" clause against a list of Model objects. There's a leak that I had to patch by using a list comprehension to generate a list of integer IDs from those objects.
Then I ran into a "you're doing a check against an empty list; we are pretty sure this will come back with no results, but we're going to run the query anyways." Another leak (and it's a valid one), but I have to manually work around it if I don't want that extra query run (I didn't).
Finally, we get the script to the point where it's passing all of the unit tests, and package it up for the integration tests. Those integration tests fail, because SQLAlchemy dynamically imports DB connecters based on the DB being used. The unit tests were running in SQLite in-memory DBs, the production Aurora, requiring a connector to be installed. Another leak.
Ultimately, ORMs can be useful, but they are an abstraction which leaks all over the place. So long as you're OK with the tradeoffs (the rest of my team was; I'm ambivalent, though I was the one who ended up fixing all of these leaks), they're another good tool for your programming toolkit.
[+] [-] lz400|8 years ago|reply
That said, people seem to still use ORMs in Ruby, Java, C#, etc. and frameworks look fairly mature.
[+] [-] pastaelle|8 years ago|reply
[+] [-] mwnorman2|8 years ago|reply
Other posters have mentioned the 'Vietnam of CS' post from many years ago - the main message I would like to highlight is that ORM's are deceptive: it appears easy to create one, but once you spend some time with how people actually use databases, the real-world starts to drag you down (the 'leaky abstraction' meme thrown around). An example of this is: how many ways are there to map a One-to-One relationship (hint: greater than 5)
[+] [-] mwnorman2|8 years ago|reply
See slides 20, 21 and 22: http://images.slideplayer.com/9/2645248/slides/slide_20.jpg
[+] [-] DanielBMarkham|8 years ago|reply
The thing is, you should really only want your database to do a small number of things, and those things you should pay special attention to. You don't want a database that does everything and all runs on autopilot. Then who the heck is coding the system?
I think most any app has a reasonable number of features which map to an also-reasonable-number of function calls back to persistent storage. Identify those and make them first-class citizens in your coding. After all, aside from UI dazzle, that's where all the magic happens.
[+] [-] manigandham|8 years ago|reply
Most of the complaints seem to be from using exceptionally bad ORMs or using them in the 1% of scenarios where performance is critical or very complex queries must be produced.
There really needs to be less blind hate and more understanding of when and where tools should be used (as well as the general quality of the tool itself). That's one of the basic tenets of programming so it's rather unfortunate to see so many misguided arguments here.
[+] [-] richardknop|8 years ago|reply
A good ORM library needs to be well designed, battle tested and mature. Then it provides great benefits over writing just raw SQL. That takes years of development effort.
The problem is that with new languages (Node, Golang, Rust) there hasn't been enough time and effort put into ORM yet to produce something good enough to be usable.
When developing in Java or Python there are great, mature and battle tested ORMs I would use but with these new languages I had bad experience with ORMs as they still seem experimental and have issues therefor stick to raw SQL when working in newer languages.