ORM is an anti-pattern

[+] eftpotrm|14 years ago|reply

Personally, in developing quite a lot of different data-backed apps, I've never really found the problem ORMs are solving to be a hugely significant one; it seems like a 'quick fix' for coders who don't really understand SQL anyway, which always felt to me to be attacking the problem in the wrong place. SQL isn't that hard....

In any case, while I don't dispute that it might offer speed of startup advantages for some developers, it seems no-one is so far disputing that it simply doesn't scale and, if your project really takes off, it will be creating problems. Call me a fogey if you will but I don't like the idea of launching a project that I know will need very substantial rearchitecting too early in its life.

[+] ianterrell|14 years ago|reply

It's hard to believe this isn't just trolling.

Writing a large app directly using SQL is not "hard," but A) it's exceptionally boring, B)it necessitates mental context shifting from your app language to SQL, and C) it requires you to think about nuts and bolts routine below-your-paygrade solved concerns like SQL injection, transaction management, type casting, etc.

Two more points:

I've never seen a large app that either didn't use a popular ORM or didn't end up REINVENTING large chunks of functionality ORMs provide. Good developers write DRY code, and SQL everywhere leads to, eventually, some library that looks remarkably like... an ORM...

Finally, in the case of large web applications (since that's HN's bread and butter), an ORM'd app will not need a "very substantial rearchitecting" to get rid of the ORM. That statement is patently absurd. Early in its life, for scaling (always, IF it becomes necessary) what it will need is dropping down to SQL to tune some aspects of some pages. Architect for doomsday if you like (or, conversely, if you need extra hours to bill) -- or architect to ship.

[+] MartinCron|14 years ago|reply

I'm definitely an old-school code-sql-by-hand guy and I used an ORM (Entity Framework 4.0 Code-First) for the first time for my startup project, a data-intensive online strategy game.

I've found that for many things, it's so much faster in terms of dev time, especially with the super-cool code-first approach, to get things out there and working using the ORM. I can create a new fully-functioning and reliably-working repository class, along with its test double, in around a minute. Really.

Of course, I have found that I've had to replace bits of it with hand-coded SQL for performance reasons. But I've decided to stick this general approach for now because I don't need to substantially rearchitect everything, I can just replace the very few bits that have proven to be an actual performance bottleneck, and keep the development speed for the many places where runtime speed just doesn't matter as much.

[+] ubernostrum|14 years ago|reply

The problem with writing all your SQL "by hand" is that it's needless code duplication.

Set aside databases for a moment. Let's say that instead we're going to generate a bunch of different types of documents, all in the same output format (PostScript or LaTeX or HTML or whatever). Which approach would you take to this problem?

A) For each type of document, manually write out the necessary code from scratch, starting over again each time, or

B) Write some general-purpose libraries which understand the output format and expose an API that you can call programmatically to generate the documents?

I don't think anyone would choose (A) in this case. But many people do choose, and pride themselves on choosing it, as soon as we change from any other use case to generating database queries. Why do we suddenly have to throw away everything we know about writing libraries and avoiding duplication, just to pretend that typing out some SQL makes us "real programmers"?

ORM, in essence, is nothing more than this: a convenient API you can call to do database queries. Why should it be so different?

[+] rapind|14 years ago|reply

It's not that SQL is hard, it's that it becomes very redundant very quickly.

That being said, I use ORMs all the time, but have never really been a massive fan. There's always going to be a little bit of mismatch because object != relational.

I find myself attracted to document databases simply because it feels like there's less of a disconnect between objects and documents.

I do remember an ORM called iBATIS that wasn't too bad though. It actually made my classes look cleaner and didn't treat SQL like a red headed stepchild.

These days, working with ARel doesn't make me feel too dirty.

[+] Spyro7|14 years ago|reply

I have worked with ORMs and with raw SQL. In my personal experience, using raw SQL exclusively becomes time consuming when you need to make the jump between the raw sql results and objects.

With a well designed ORM, it should be possible to have the best of both worlds. One of the features that I value in Django's builtin ORM is the ability to almost seamlessly drop down to raw SQL when needed:

https://docs.djangoproject.com/en/dev/topics/db/sql/

[+] encoderer|14 years ago|reply

I don't know how I could manage deployments as effectively as I do today without Migrations.

And i don't know how I could ever write DB functions more clever than, say, the ones DataMapper gives me.

And I don't know how anybody could prefer

"Select * from TableOne Inner Join TableTwo using(SomeCol) Inner Join TableThree using(OtherCol) where TableOne.foo = ? and TableOne.bar = ?"

to, say, ORMs at their best:

result = TableOne.findByFooAndBar("foo", "bar");

[+] cageface|14 years ago|reply

I was forced to port an existing in-house web app from Postgres to Oracle. Thanks to ActiveRecord I just made some tweaks to the db config file and didn't have to touch a line of code. Of course, it's not always this easy but a good ORM can buy you a lot of portability.

[+] mechanical_fish|14 years ago|reply

My conclusion, drawn from the title alone: The term antipattern has apparently jumped the shark.

Spend five minutes decoding a particularly hairy regular expression? Regexps are an antipattern. Someone writes an inefficient SQL query? SQL is an antipattern. Stub your toe on a curb? Curbs are an antipattern.

[+] NyxWulf|14 years ago|reply

This is my favorite response. One of the reasons I came to HN from Reddit was that the proportion of populist/echochamber articles and comments to well thought out and insightful articles and comments changed unfavorably imo. It makes me sad to see the trend continue on HN, perhaps it is inevitable.

My frustration with this article is that while it makes some valid points that could well be articulated as things to watch for, it instead takes that to the extreme and generalizes a single person's biased experience to the notion of a more universal truth. Unfortunately the world is much more complicated than these type of articles like to admit.

Yes SQL is well known, and solves some sets of these problems, but it also introduces other problems that can overwhelm depending on the environment and the scale. As an example, at my last job I worked at a company that worked in banking and mortgages. They tangle of banking and mortgage laws is different state by state, and in some cases county by county. At this particular company an unusually large percentage of the people know how to write SQL. So vast amounts of business logic were written into SQL. This was stored variously in stored procedures, DTS steps, SSIS Steps, Excel VBA Functions, Access Functions, etc, etc, ad nauseum. May not sound like much until you realize that we counted 50 thousand excel documents on the network stores, 10 thousand access databases, 10 thousand stored procedures, etc, etc. This was for an IT team of about 20 people. Does this mean SQL is an anti-pattern? No, it just means that tools taken to the extreme and used well beyond their design tend to stop functioning well.

ORMS certainly have their problems and can definitely slow things down. However almost all of those arguments could be and were used to argue why assembly was better than C, why C was better than Java, why Java was better than Ruby...almost seems like a pattern. ORMS can save you on maintainable code footprint, and can reduce complexity if they are used correctly. Of course that doesn't mean they can unthinkingly be applied in every situation with no design. The contra is also not true. Unthinkingly applying SQL in all situations is just as perilous and fraught with issues.

I suppose it is against human nature, but I wish for the day when people could have reasonable discussions about issues, recognize that other people have some valid points, and possibly learn from each other and not feel the need to over-sensationalize everything.

[+] absconditus|14 years ago|reply

It seems like the term was flawed from the start. "Anti" does not mean "bad".

[+] mgkimsal|14 years ago|reply

"The whole point of an abstraction is that it is supposed to simplify"

No, it's supposed to abstract. A simplification is supposed to simplify. Often abstractions have the benefit of simplification, but it's not a requirement.

I migrated a project from MySQL to PostgreSQL last summer, and the project was built on Grails with GORM. I had to migrate the data by hand (mostly easy, save for a couple of edge cases like boolean columns), and I had to change the jdbc driver. That was pretty much it. No rewriting of SQL, no changing of escaping logic, etc. I tell a lie - the auto-sequence generation stuff of postgresql wasn't playing nice with some of the GORM identity stuff, and my code had made some assumptions that turned out not to be 100% true. Those likely would have shown up had I written my own stuff rather than relied on GORM, but it was a little bit of a pain to track those down.

All in all, using the ORM abstracted away the need to write against specific database commands and syntax. A byproduct of that was simplification of most use cases of the database, but the key use was abstraction.

[+] InclinedPlane|14 years ago|reply

I've seen a lot of the problems that ORM creates with big projects. The most egregious is lack of control. You'll run into some problem caused by some quirk of your ORM system and you'll dig down into the SQL and learn precisely what's causing it, but you still won't be able to fix it because you don't know the magic voodoo incantations to change your config or the ORM client code in the right way to fix it.

When ORM starts to get in the way like that it really makes you wonder whether it's worthwhile.

[+] Daishiman|14 years ago|reply

Not true. Any acceptable ORM system (and there's a lot of those nowadays) allows you to drop into pure SQL and several levels of abstraction between that and pure object code.

Really, I don't know which library you're using, but I have never seen such an issue with neither Django nor SQLAlchemy, LinqToSql, or Yii's ORM.

[+] sunchild|14 years ago|reply

I would love to hear about an example of this kind of case.

My experience is that ORM queries, used properly, deliver predictable results. If they don't because of a bug in the ORM, can't you just drop down to the native query and move on?

[+] initself|14 years ago|reply

It's not worthwhile if you don't understand what the function of the ORM code is. ORM might be an abstraction, but it doesn't mean it doesn't take some elbow grease to learn. Like anything worthwhile, once you get over the learning curve hump, it becomes a tool that you enjoy using precisely because you understand it.

Even when you have to sometimes write SQL inside your ORM!

[+] gte910h|14 years ago|reply

It sounds like you didn't bother learning how to use the optimization features of your ORM.

Almost all of them have them....plan on using them next time at that stage of the projet, it will turn out fine.

[+] sunchild|14 years ago|reply

What is it about coders and blogs that brings out the "cranky old man" vibe?

An ORM is an insanely convenient way for newbies to use various data stores while avoid learning umpteen different query languages. The teaching value of ActiveRecord for newbies is hard to overstate. It's also a damn nice way to move your application closer to platform independence – a valuable thing in today's PaaS integrated stacks.

If you seek efficiency and performance, don't use an ORM. Lick the freezing cold metal, if you want. Nothing is stopping you from doing what you like!

(Also, I haven't dropped down to SQL since Rails 3.x and meta_where. Yes, I realize that my applications "won't scale". They are appropriately scaled for their intended purposes.)

[+] enjo|14 years ago|reply

It's not just for newbies. I'm competent with SQL, but that doesn't mean I enjoy writing query after query. We use Django and utilize the ORM... we jump down into SQL in the (rare) case that we really need too.

I'm sure at higher scale (we're mid-level at this point) that we might be hand tuning more and more, but that hasn't happened yet.

[+] mistermann|14 years ago|reply

One aspect that always comes up is the "inefficiency" of doing a select * from a table with 30 columns when you only need 4 columns. 99% of the time the millisecond performance difference doesn't matter, and if it does, there is a standard non-default way to handle it in most ORM's.

However, one aspect that is usually conspicuously absent in anti-orm blog posts is that of development time and cost. ORM usage practically guarantees known coded efficiencies, but it lets you implement and pivot really quickly, the time and money saved is easily more than enough to pay for a bump in hardware to overcome the 10% slower code. But to do so is heresy for these people....selecting columns from the database that you do not use is just not done, full stop. Which is cheaper, in dollars, is irrelevant.

[+] epscylonb|14 years ago|reply

This surprised me as well, I was told that in Postgres at least, there is no difference in query speed between selecting a subset of a record and the entire thing.

[+] absconditus|14 years ago|reply

If your database server is on another machine you are now transmitting 30 columns of data over the network instead of 4.

[+] mattmanser|14 years ago|reply

Well, it depends if your columns are blobs or not, it makes a difference (well, at least it used to).

On the other hand, I don't know which ORM you use, but they don't all behave like this, if you tell them you only want columns Name, Age, DOB, they'll only get those columns.

[+] div|14 years ago|reply

Labeling an ORM as an anti-pattern is throwing the baby away with the bathwater. Sure, you will encounter some cases in which your ORM will be a pain in the ass or even actively work against you, but most good ORM's will allow you to talk to the database directly.

For example, both Hibernate and ActiveRecord allow you to just throw straight sql to your database, returning a bunch of key value data.

Which is exactly what a good solution does: provide large gains for the common cases, and get out of the way for the edge case.

[+] perlgeek|14 years ago|reply

> If your data is objects, stop using a relational database.

What does that even mean?

My data, is, well, data. Tables and rows are just ways to represent my data, as are the nested hash and array structures of document storage systems. Oh, and tables and rows are also objects.

What data is "object" and what data is "non-object"?

[+] mixonic|14 years ago|reply

Embedding strings of one language in a second language is an anti-pattern.

I've been at a bunch of NYC dev events recently, and people at both Goruco and Percona Live were hating on ORMs. ORMs have gotten really good in the last few years, I think the haters just haven't been using them.

Show developers a good alternative and they will go there. Some of the basic points made in this article ring true, but the suggested alternatives are weak. ARel is a great start to a non-orm database wrapper in Ruby! Somebody just needs to go there.

[+] unknown|14 years ago|reply

[deleted]

[+] encoderer|14 years ago|reply

Somebody may have already mentioned this, but there's a fantastic essay The Vietnam of Computer Science (2004) on this subject. It's long but so, so worth it.

http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Compute...

[+] wulczer|14 years ago|reply

My first reflex was to write a comment mentioning that essay and then I found this. Wish I could upvote you more...

[+] JunkDNA|14 years ago|reply

Wow, thanks for contributing this to the thread. I had not seen this before. That has to be the best explanation of the tradeoffs/benefits of ORM I've seen anywhere. Most of the content isn't "new" in the sense that if you've done both a lot of SQL and a lot of ORM work, the issues are pretty obvious. But it does a fantastic job of showing why there's just no simple answer currently.

[+] adelevie|14 years ago|reply

I'd be interested in the author's take on ActiveRecord's implementation of Relational Algebra with ARel[1]:

> To manipulate a SQL query we must manipulate a string. There is a string algebra, but its operations are things like substring, concatenation, substitution, and so forth–not so useful. In the Relational Algebra, there are no queries per se; everything is either a relation or an operation on a relation. Connect the dots and with the algebra we get something like “everything is named_scope” for free.

Also, if I couldn't use something like ActiveRecord in my Rails apps, I'd end up re-writing most of its functionality in my model code somewhere. If I don't get Model#find_by_some_attribute() for free, then I have to spend time writing it.

[1] http://magicscalingsprinkles.wordpress.com/2010/01/28/why-i-...

[+] JulianMorrison|14 years ago|reply

I like the iBatis (now renamed mybatis) approach: explicit queries in a separate file that say "input an object of this type reading these fields, output an object of that type setting those fields" and contain raw SQL to be thus parameterized.

This avoids the two largest flaws: live proxies with hidden state pretending to be simple data objects, and SQL being generated with no control. It also avoids a mistake I've only seen two ORMs make but they're common ones: defining its own dialect of not-quite-SQL.

You still get objects mapped in and out of DB queries, it saves you the pointless grunt work of "copy A, put it in B" and it prevents the as-bad-as-ORM anti pattern of "SQL scattered throughout your code".

[+] narrator|14 years ago|reply

I think Ibatis is great for querying data out of the db and ETL operations, especially when I'm using a lot of database features. However, IMHO, Hibernate is better for crud opts on individual records because it takes care of dirty flagging and managing relationships.

[+] ianterrell|14 years ago|reply

"In the long term has more bad consequences than good ones."

I would hypothesize that one of the long term good consequences of healthy ORM options is the existence of the vast majority of database backed applications we all know and love. Sure, when/if they got popular someone had to tune some SQL, but how many of those projects would have even been started without ActiveRecord or Hibernate or EJB3 or CoreData?

The opinion that "ORM is an anti-pattern" is ridiculous nonsense.

[+] SeoxyS|14 years ago|reply

The main problems with ORMs is that they're trying to work around non-object-oriented data stores. Layers of abstractions and ORMs in particular are generally good things—but they can't do magic when it comes to dealing with SQL.

If you're going to be using an ORM, I'd strongly recommend rethinking your data store. Object databases such as MongoDB is a perfect fit, but even a key-value store like Cassandra would be a much better option than SQL. I think it's interesting to note that Core Data, Cocoa's ORM, is one of the fastest data store out there. It uses SQLite, but defines its own schemas. I believe it'll also let you store pure binary data.

[+] chc|14 years ago|reply

Small nitpick: If you're using a non-relational database, the library you use to connect to it is not an object-relational mapping layer.

[+] dgallagher|14 years ago|reply

Yes, Core Data supports storing raw binary data. IIRC is also uses caching (for SQLite stores), and lazily-loads related objects as needed. You can customize this behavior to optimize its memory usage for your code.

[+] DavidMcLaughlin|14 years ago|reply

This seems incredibly naive.

ORMs reduce code duplication. They speed up development, especially when you're treating the underlying data storage as a "dumb" datastore that could just as easily be sqlite or H2 as MySQL or Postgres.

As for ORMs having some sort of negative impact on the queries sent to the underlying database - it really depends on what ORM you use but any ORM I've used had support for pre-loading relationships in advance when required, removing that N+1 problem.

I also want to add that I wrote an ORM for the first company I worked for and when it was finished it was a drop-in replacement for 90% of the queries in our application - and I mean that literally the SQL generated by the ORM was exactly the same as the SQL being replaced. The queries that it couldn't replace (mainly reporting queries) already had an aggressively tuned caching layer in front of them anyway because they were so hairy.

But the real point is this: the performance of the ORM didn't really matter because we were a database driven website that needed to scale - so we had layers upon layers of caching to deal with that issue.

And that is an extremely important point - the way ORMs generalise a lot of queries (every query for an object is always the same no matter what columns you really need) lends itself to extremely good cache performance. Take the query cache of MySQL for example - it stores result sets in a sort of LRU. If you make n queries for the same row in a DB but select different columns each time - you store the same "entity" n times in the query cache. Depending on how big n is, that can cause much worse cache hit performance than simply storing one representation of that entity and letting all n use cases use the attributes they need.

Now, relying on MySQL's query cache for anything would not be smart, but replace it with memcached or reddis or whatever memory-is-a-premium cache and the same point stands. Another example to drive the point home is a result set where you join the result entities to the user query so that you can get all the results back in a single query. In theory this is a great way to reduce the number of queries sent to your DB but if you have caching then there are many times where you could have very low cache hit ratios for user queries since they tend to be unique (for example they use user id) but where you could still get great cache hit performance if certain entities appear often across all those result sets by leaving out the join and doing N+1 fetches instead.

ORMs prevent you from scaling as much as using Python or Ruby over C does.

So I guess that leaves the point about leaky or broken abstractions. Well I would never claim that you can abstract across a whole bunch of databases anyway, I think that's a ridiculous claim that most ORMs make. These types of abstractions when people try to hide the underlying technology are really just a lowest-common-denominator of all the feature sets. So if you chose some technology because you really wanted a differentiating feature then most likely you will find yourself working against such abstractions. Interestingly enough, the dire support for cross-database queries which are perfectly legal in MySQL but not in other vendors is the reason I had to roll my own ORM. But the productivity and maintainability benefits were well worth it.

So yeah I guess what I'm saying is: premature optimization is the root of all evil, there are no silver bullets and performance and scalability is about measuring and optimising where needed. And finally: ORMs are not an anti-pattern.

[+] akeefer|14 years ago|reply

Calling the author naive is, I think, uncharitable. I've also written my own ORM layer, used for large-scale, high-performance enterprise applications, and I largely agree with the original author's post, so I certainly don't think it's naive. It sounds to me like the thoughts of someone who's ran into some real problems in real situations.

I don't disagree with the points you've made around caching, but I do think you're simplifying the problem a bit. Not all performance tuning in DB-intensive applications is around caching, and it often involves query tuning, indexing, and traditional DB-level stuff.

A large part of the abstraction leak around ORMs is around both the caching and that DB-level performance tuning. You have to understand what code is going to generate what queries so that, at the very least, you can tune them by adding in the appropriate indexes in the database. All of a sudden, you're living in SQL land, examining query plans, etc. But if you decide that the change you need to make is to the SQL itself, the ORM layer suddenly gets in your way: you either have to bypass the ORM layer to drop into raw SQL, which at worst is hard to do and at best tends to massively reduce the value proposition of the ORM framework, or you have to try to tweak your code to get it to generate the query that you want, which is often frustrating and far more difficult than just writing the SQL yourself. I don't think I'm that much in the minority of having an experience like, "Hmm, the query I really need to write needs to use an ORDER BY statement that includes a few case clauses . . . now how do I convince this query-generation framework to spit that out so that I don't have to pull back all the results and do the sorting in memory?" It's also worth mentioning that caching doesn't help tune writes, so if scaling your product requires scaling writes, you're probably going to be mucking around in SQL land.

There's a similar problem around query-generation layers that attempt to allow you to just write normal methods and have things executed on the database; because the code is so far removed from the SQL, it makes it really, really easy to write really terribly-performing queries or to write things that will do hugely unnecessary amounts of work.

On a more trivial point, the fetching all columns when you only need a subset of them problem is really an issue sometimes, especially if you A) have to join across a bunch of tables, B) the columns that you want could be retrieved from indexes, rather than requiring actual row reads, or C) the columns that you care about are strictly several removes away from the original search table, but the ORM layer loads everything in between. (For example, Foo->Bar->Baz, my WHERE clause is on Foo, but the only columns I care about are the id on Foo, which is in the index, and a few columns on Baz . . . how do I tell my ORM layer to load nothing from else from Foo and nothing at all from Bar? It's a different problem than pre-fetching, because I just don't want anything loaded.)

Now, that's not to say that ORM layers can't be made to perform; of course they can, pretty much all of them have the sorts of hooks you describe, and there's plenty of empirical evidence to that effect. But sometimes the way you make them perform is by just bypassing them.

There's another abstraction point, which is that supporting multiple databases often leads to a least-common-denominator functionality approach; for example, if you want to use a db-specific spatial data type, the ORM has to either provide db-specific functionality, or it might just not support handling that type of data well. The same often comes to things like db-specific functions or query hints; if the ORM layer doesn't handle those things for you, you have to bypass it and drop into raw SQL if you need them.

So really, the argument is not, "ORM's are not functional and no one should use them," it's related to the value proposition of an ORM layer. The value proposition is "This tool will make your life easier, will save you from having to write SQL, and will help you work across multiple databases." If the tool makes life harder than it otherwise would be, then it's not useful, even if it's still possible to do work in it.

So the question is largely around whether or not they make life easier or not. In the simple case, I think the answer is that yes, they do: they make it easier for beginners to get off the ground, they make it easy to do simple queries and writes, and the performance probably doesn't matter anyway.

When things get more complicated, though, the question becomes a lot less clear. Yes, the ORM layer makes it easier to have structured queries that can be cached . . . it also makes it harder to have one-off queries that can be tuned easily based on exactly what data is needed and tweaked to convince the database to generate the right query plan, and it makes it much harder to look at some DB stats, identify a poorly-performing query, and then map it back to the code that generated that query. I know of applications that have basically had to bypass ActiveRecord more and more as they scaled to just do raw SQL queries because making ActiveRecord perform was simply too hard or not possible.

So personally, I prefer an ORM approach that does minimum stuff to let me do the simple things simply (pull rows back and map them to an object, execute simple queries directly on that table), but that's designed from the ground up with the idea that dropping straight into SQL is a normal, accepted part of the workflow, rather than some one-off thing that you should rarely do. But it really depends on your project and your comfort level with SQL.

[+] ssmoot|14 years ago|reply

An IdentityMap would resolve the issue with your second-layer cache, allowing you to select only the columns necessary, which can be a very significant win.

Ruby O/RMs for example generally take a big hit from loading date/time-columns. If you don't need the columns most of the time, then why load them?

Many (not all) databases will take a significant hit when loading TEXT columns. Again, most of the time you probably don't need them, and could be better served with a 'summary' column.

Just nit-picking really. Though I do generally try to think of caching as a last resort; too often it feels like a hammer for compromised design.

[+] keithnoizu|14 years ago|reply

  Caching can only go so far . . . eventually you are going to need to refactor or tweak schemas and adjust queries. Especially if you are dealing with user centric / heavy write data although yes cache on write can also reduce some strain.

 Personally I would take hand rolled DAOs over magically generated ORM/AR logic any day. The upfront cost is marginally higher but modifying, adjusting and logically grouping data is much more straight forward.

[+] gaius|14 years ago|reply

treating the underlying data storage as a "dumb" datastore that could just as easily be sqlite or H2 as MySQL or Postgres

If you're doing that then you've missed the point of a database and might as well be using flatfiles.

[+] jgrahamc|14 years ago|reply

One of the problems I frequently see is that people complaining about ORM and SQL are thinking mostly of some object wrapping a row (or set of rows) in a table. Then they get into trouble when they want to wrap something more complex involving joins between tables.

All these problems would disappear if people used database views. Then their nice ORM layer (say ActiveRecord) would work perfectly and the nasty joining and updating would be taken care of by the database. I've often wondered if people even realize that database views exist and how powerful they are: http://en.wikipedia.org/wiki/View_(database)

Of course, it's only relatively recently that MySQL has started supporting views properly (in 5.0).

The other nice thing about views is that it means your code using the ORM is simplified because you aren't indirecting through different objects to get at specific values you need to display. It also means that only the necessary data is retrieved from the database.

[+] G_Morgan|14 years ago|reply

That is the other problem. Every database is a mash of semi intentional subtle incompatibility with the standard and a host of non-standard features. When an ORM comes along many try to expose the non-standard features that make sense in some way but end up needing a custom solution for each DB (see how you do an auto sequence ID for an entity bean between Postgres and MSSQL).

So you have a semi portable layer interfacing with a semi portable environment.

[+] gerardo|14 years ago|reply

The relational-object problem is called Object-Relational impedance mismatch, duh!(http://en.wikipedia.org/wiki/Object-relational_impedance_mis...)

For me, fast Web Application development is worth the tradeoff. I usually begin to hate sql on the second month of a project.

[+] wccrawford|14 years ago|reply

I love SQL. What I hate is the tedium of turning the results into something useable. ORMs eliminate that pretty handily.

As they say, "If it didn't exist, I'd have to invent it."

[+] gte910h|14 years ago|reply

This is a person who doen't write many large scale systems:

You will have considerably more (sometimes serious) bugs if you write all your SQL by hand all the time in a app that uses a lot of DB queries.

Yes, you still need to understand what the ORM does when you do certain things, you still need to understand what nasty joins you're writing and all that. But you can let all the minutiae of what you DO write work out well in a rote, well tested manner.

The article smells a bit of a guy who didn't know SQL or had a team member who didn't, and they though just using and ORM would work.

If your app is successful, you will usually need to optimize things. But this is true for SQL or any time saving abstraction as well, not just ORMs

[+] hahainternet|14 years ago|reply

What I found most telling was the talk about doing complex joins in your application.

You do them in views and stored procedures. Your application code should be distinct from the data it sources, so people don't have to wade through 400 lines of terrible PHP just to change 'fullname' to 'concat(etc)'.

Don't torture your DBAs.

[+] code_duck|14 years ago|reply

I had a lot of problems working with ORMs when I was 1-2 years into programming. However, I also felt a lot of resistance to learning to use a framework vs. straightforward, procedural code for web apps.

It's a matter of wanting to take the time to learn another system, API, DSL, what-have-you just in order to work with something you already know - SQL. The dislike of HQL resonates with me - I was wondering why I would ever work with PHP Doctrine's DQL. Building SQL queries out of a sequence of OO method calls seems absurd, too. As the article and comments note, you shouldn't have to know SQL well to use an ORM.

There are definitely issues with the ORM/Framework working against you, too. I love the organization and features in Rails or Django, but I hate when I spend hours working out how to do something that would take 5 minutes in plain PHP. Same with ORMs. Getting them to do the right type of join, not make unnecessary calls, etc. can be a pain. Sometimes it's that I don't know the software well enough, which could either be my own problem or just a reasonable lack of desire to devote my brain to it. Other times it's that the given ORM really does have shortcomings, conceptually and at level of development.

The one ORM I've had the most luck with is Django's. It's straightforward, does what I want, is well documented, and doesn't have too many features.

[+] Swannie|14 years ago|reply

In the comments I'm noticing no one ask: when should or shouldn't you use an ORM?

Most of the discussions are over the merits of either approach, when to me it seems an ORM has many places it belongs. And a few it doesn't.

For most database of record systems, which are a large chunk of your average webapp, an ORM is a god send. When I say DBOR, I mean things like articles, posts, comments, users, products, transaction history. An ORM saves a large amount of work writing SQL, it covers 95% of your queries (particularly insert,update,delete and simple gets) with minimal effort. You create the model, and let it get dealt with by the ORM. Your objects are mainly records. The pain comes when you start wanting to do analytics and interesting reports - but stick with a reporting tool, and keep this out of your application, and you feel less pain.

But this breaks down when you move to a database that represents a complex real world system. If you're working on a model that represents, for example, an electrical distribution system, these are not really records. They represent a vast set of complex interrelations, Of course there are still records, but in isolation, away from the complex relationship of say pole->{location,type,maintenance history,conductors,insulator type}, and conductor->{poles traversed,length,a end location,a end join type,b end location, b end join type,material,material batch number,power circuit carried} etc. etc. Then your queries to "find all customers affected by the pole at these coordinates", requires joins through: pole, conductor, circuit, serviced area, customers... we're moving rapidly to lots of complex queries, where hand crafting really is the way to go.

191 comments