How to use Query Objects to refactor Rails SQL-queries

[+] projectileboy|6 years ago|reply

My rant is not entirely appropriate in the context of Rails, but generally speaking, I cannot for the life of me understand why so many programmers take something very simple - a SQL query - and decide the best way to deal with it is to hide the SQL in some kind of wrapper, thus “simplifying” it. Not once have I used one of these libraries where within hours you run across a non-arcane scenario that the wrapper can’t handle (just yesterday: Postgres type cast to insert an enum value).

[+] pcarolan|6 years ago|reply

For 80% of use cases all you're doing is Person.find(id), Items.top(10), user.email, etc. ORMs are great for productivity, then eventually you run into a complex query where you want to join a bunch of tables, sort, filter, etc. In these cases, the best practice is to do this in SQL in my opinion. All ORMs give you the ability to get an object by executing raw sql. It's not an either or, it's a both and for general productivity.

[+] barbecue_sauce|6 years ago|reply

I think to many programmers, writing a bunch of separate SQL queries seems like it violates the DRY principle. If you start to address this by parameterizing the construction of query strings, it starts feeling like you're writing an ORM any way, so you might as well use one in the first place.

Personally, I don't really have a preference. I think each approach has a place depending on the scope and intent of the project, the size of your team, and their familiarity with raw SQL.

[+] Daishiman|6 years ago|reply

SQL is simple for a certain class of use cases, but as a language it lacks a lot of things, or supports features that are difficult to use in practice:

* It's much easier to manage and manipulate variables in the general-purpose language that constructs queries than in SQL

* It is practically impossible to compose queries or manipulate their AST. SQL is not easy to compose; ORM languages make composability trivial

* The above feature of ORMs make them amenable to the constructs most people use for manipulating queries with common web framework abstractions. In DRF, queries are separated by a base query, filtering by key, filtering by random acceptable criteria, pagination, etc. You just cannot separate the query components in SQL in a way that naturally maps to this abstraction.

* Validating SQL syntax within an IDE that supports a host language is painful; validating chains of ORM events built in the host language is trivial.

* Assigning behavior to resulting rows through objects in a ORM is, again, super trivial, while manipulating raw results in SQL to give them abstract behavior requires building your own intermediate layer. Why build something when it's already there and optimized for your use cases?

* If you're building a library for common behavior that is database-agnostic, ORMs handle this well. You're not forced into learning the ins and outs of uncommon database engines.

[+] Tainnor|6 years ago|reply

I think this comment is tangential to the article.

The author happens to use ActiveRecord (an ORM) in their query object. They might as well have used raw SQL, a simple query builder, or anything else. The more crucial point is that the data access code is isolated from the web related code (the controller) and that the interface relates to the domain, instead of to DB details. This is good practice in any moderately complex code base IMHO. I have seen horrible Rails codebases that just use database calls everywhere in the application, often even in templates.

I think it's a good pattern, but in general (after having written quite some Ruby in the past), I still don't understand the aversion of many Rubyists to consider "architecture" as anything more than "the names of the folders of your Rails app", instead of thinking about layering, bounded contexts, etc.

(As for ORMs, they definitely have drawbacks; but handwriting SQL leaves you vulnerable to other issues too, such as code injection, runtime syntax errors, lack of type safety (admittedly, not relevant in the context of Ruby), etc. These days I mostly just prefer a simple query builder with the option to drop to raw SQL when necessary.)

[+] specialp|6 years ago|reply

We have abandoned ORMs completely for a repository based approach. It is much more satisfying to write a function to extract/insert whatever you want into a database using the full expressiveness of their query language, and power of the database. The premise of having something abstract the DB is cool but with ORMs you lose expressiveness, and are now fighting the syntax and operation of it. The overhead is also large.

[+] shay_ker|6 years ago|reply

I think the general idea to be able to test bits and pieces of the logic. With an ORM like ActiveRecord, Sequel, or SQLAlchemy, you can split up a much larger query into multiple functions and then test each individual function. It makes it a lot easier to reason about future changes.

I don't know if I agree with a "Query Object" - I don't instantiate objects that much these days, but it's nice to put that logic into a different file than, say, your model or controller.

[+] JMTQp8lwXL|6 years ago|reply

Security. Security is the huge implementation detail as to why we aren't writing "SELECT * FROM" + unsanitized_unsafe_user_input. When the Junior on your team doesn't know about prepared statements, at least you can fall back on the ORM.

[+] glennericksen|6 years ago|reply

A "when to" should be added to this "how to". This might be cohesive in a smaller code-base, but it is not pleasant as "query objects" multiply. There are other ways of handling complex queries closer to the model without this type of abstraction. If you reach for this pattern, be sure it's worth the added cost in complexity.

[+] jonathanhefner|6 years ago|reply

I created a gem which wraps all this up in a nice API: https://github.com/jonathanhefner/talent_scout

So the final code from the article would look like:

  class ProductSearch < TalentScout::ModelSearch
    criteria(:search) { |search| where("title ILIKE '%?%'", search) }

    criteria(:from_price) { |from_price| where("price > ?", from_price) }

    criteria(:to_price) { |to_price| where("price < ?", to_price) }

    criteria(:properties) { |properties| joins(:product_properties).where(property_id: properties) }

    criteria(:category_id) { |category_id| where(category_id: category_id) }

    order :price, default: :desc
  end

Brevity is an obvious benefit, but reliability even more so. For example, the code in the article has at least two major mistakes:

1) `from_price` will silently never be applied

2) `sort_type` and `sort_direction` defaults are mixed up and always nil, either of which will raise an exception, but only when the corresponding request params are omitted

Another benefit of using the gem: instances of the `ProductSearch` class from above can be used with the stock Rails form builder (i.e. `form_for` and `form_with model:`).

[+] victorbojica|6 years ago|reply

I'm trying to figure out what is of such interest. The fact that you can do all the querying very neat or hiding all the logic behind a class ? Am i missing something ?

[+] tjpnz|6 years ago|reply

It just puts the DB access logic behind a domain object, people have been doing this or a variation of it for years.

[+] the_gastropod|6 years ago|reply

I’m with you. Indirection for indirection-sake is not a great idea. If this code should be moved anywhere, it’s to a scope on the model.

[+] revskill|6 years ago|reply

In real world, most of our "pattern" become technical debt as codebase grows very fast. My solution is just simply using raw sql query. Hide complexity of query behind another complex "ActiveRecord" is not the solution.

The benefit is separation of concern when your team has a SQL specialist. And no more code technical debt as your project scales.

[+] kburman|6 years ago|reply

I highly disagree with you.

I'm currently working on a Rails project which does exactly this. And it's a true nightmare to maintain or add a feature in it. Only exception for writing raw SQL is when you want CTE.

[+] learc83|6 years ago|reply

I take a hybrid approach and try to build view backed models for complex queries. Then the rest of the team can use those models with ActiveRecord like they normally would.

[+] aantix|6 years ago|reply

Most of the relationships are belongs_to and has_many.

How is a raw SQL more readable and expressive than that?

Sure, there may be an occasional AR expression that could be optimized with a raw SQL query?

But for most associations for information based systems, AR is incredibly descriptive and performant.

[+] revscat|6 years ago|reply

SQL is a living, breathing technical debt.

So what I'm envisioning in your environment is that you've just shifted your problems from a functional OO language to... SQL, which is neither.

And when those queries evolve into the unmanageable beasts that they almost always do, what then? At least with languages that have sensible means for code organization you can break them out into things like TFA is discussing.

SQL -- the language -- is shit, and this is the emperor-wears-no-clothes of our industry. The belief that extensive use of SQL in application code is something all wizards should aspire for is just completely nonsensical.

[+] mhd|6 years ago|reply

I've got some mixed feelings about this, although I don't disagree when it comes to the heart of that suggestion. For one, ORMs certainly reach their limits far too soon, and after that dealing with it is beyond the average use case, maybe even beyond the average developer working on the project.

The problem is that the same is true for SQL when it gets complicated. Do you inner/left/lateral join, do you use 'exists' or 'in', CTEs, window functions etc.

You're talking about an "SQL specialists", and there's a definite need for that here. Now, that takes either someone who had the experience in addition to their regular programming practice (i.e. someone senior), or a dedicated specialists -- which is a hard sell for small teams ('though probably a better idea than a 'build engineer' or 'UX expert', but I digress).

If at all, structure the data and the application features so that you don't need to have anything too complex. If that doesn't seem possible right at the beginning, maybe work out the complex queries/views etc. when you're starting out and everyone is still cooperating, it's really, really troublesome if some lone engineer working on some feature suddenly has to whip out the stack of Celko books.

But beyond that: If you can't fully factor out the complexity out of your application (i.e. in some external reporting tool, ETL job etc.), you'll have to put the query/statement somewhere. Whether that's something with ActiveRecord or a long literal statement. A lot of MVC frameworks have no proper place for that (definitely not in "C", and if it's just in the "M" part, you just postponed the issue). I've seen projects with MVC + Service and/or Query and/or Grid and/or Report and/or SQL directories. And then possibly splitting the content into client code and stored procedures/views/materialized views.

This ain't really a solved problem, nor do we have clearly established best practices here.

[+] lobo_tuerto|6 years ago|reply

I think Ecto has the best of both worlds (a useful abstraction over SQL and is low level enough) and the right approach:

"Ecto is a toolkit for data mapping and language integrated query for Elixir."

https://github.com/elixir-ecto/ecto

[+] gingerlime|6 years ago|reply

2018 ?

I share similar doubts to other commenters.

One semi-positive side effect of moving this into a class however, was testing (kinda?).

I'm never sure how to test certain bits of my code that might do something as simple as `Model.where(:x => "y").where(:y => "z").limit(3)` ... usually ending up with a bunch of `expect(...).to receive_message_chain` which feels extremely clunky (and doesn't even test all where params).

The other day, I even struggled to test an even simpler `User.update(:x => "y")`, because there was another `update` in a before filter on ApplicationController... and it easily becomes a mock-the-world situation.

Am I doing it all wrong?

[+] DougBTX|6 years ago|reply

I like to think about what a test of `Model.where(:x => "y").where(:y => "z").limit(3)` is supposed to be testing. If the test is that the correct query gets generated, then a simple check to ensure that the `where` calls have been made is fine. If the test is that the query returns the correct data, then either a real database or something lighter weight but which handles queries in the same way needs to be used, since it is really a test of how the query is interpreted rather than what the query is.

[+] Dirlewanger|6 years ago|reply

I'd start with removing that `update` out of the `before_filter`; any persistence calls should be only happening in the main controller action.

[+] Tainnor|6 years ago|reply

don't mock what you don't own, IMHO

the classic approach is to use your own domain abstractions, mock those in unit tests, test the domain abstractions e.g. with an in-memory DB, and have some integration tests verify that everything is plugged together correctly

making assertions about your own, domain-inspired APIs makes your tests less brittle and more readable than using "receive_message_chain"

[+] unnu|6 years ago|reply

It's just the concept of interactors used for the specific case of search.

[+] jeremycw|6 years ago|reply

This is the type of code/suggestion that I would have eaten up 10 years ago as a junior. Now this type of code just makes me angry and I spend most of my time turning stuff like this back into what it was before applying this query object abstraction.

There's nothing wrong with the code before he applies this abstraction. It's twenty lines, it's exactly in the first place I would go looking if I need to debug, it directly uses active record so I know there's no hidden BS.

35 comments