top | item 44528216

(no title)

fathomdeez | 7 months ago

This kind of issue always comes up when people put business logic inside the database. Databases are for data. The data goes in and the data goes out, but the data does not get to decide what happens next based on itself. That's what application code is for.

discuss

order

tsimionescu|7 months ago

The way you model data and store it in your database is fundamentally a part of your business logic. The same data can be modeled in many different ways, with different trade-offs for different use cases. Especially if you have a large amount of data, you can't just work with it as is, you need to know how you will use it and model it in a way that makes the common operations fast enough. As your application evolves, this may change, and even require data migrations.

None of this means you have to or even should use stored procedures, triggers, or listen/notify. I'm just making the point that there is no clean separation between "data" and "business logic".

ehansdais|7 months ago

Can't upvote this enough. The point is not that procedures outside of the DB is wrong, nor is it that procedures should always go into the DB. It's that you should look at the context and decide what the best way to solve the problem is.

chatmasta|7 months ago

The first thing I did when I saw this article was to check the Postgres docs, because I thought "heh, surely they just didn't read the fine print," but the LISTEN/NOTIFY page has zero mentions of "lock" in the entire content.

dathinab|7 months ago

I think, It's because the locking is part of the transaction commit locking, but yes it should be mentioned.

But it's oversimplified a case of "high queue load f* up the availability/timings for other DB operations" (and themself).

And thats a generic problem you have, even if just due to "generic CPU/WAL/disk load" if you put your queue into your DB even iff that specific lock would be somehow solved with some atomic concurrent algorithms or similar (not sure if that even is possible).

So in general make your storage db, and queue a different service (and you cache too), even if it uses the same kind of storage. (Through technically there are clever in-between solutions which run their own queue service but still use you DB for final storage but have a ton of caching, in memory locking etc. to remove a huge part of the load from the DB. )

perlgeek|7 months ago

I really hope somebody reading this article (or HN thread) writes a doc patch to mention that.

I'm unlikely to get it myself today, and by tomorrow I've probably already forgotten it :-(

cryptonector|7 months ago

One can replace LISTEN/NOTIFY with logical replication / CDC. And it's funny because somehow, somewhere, PG must be serializing the writing of the WAL to some degree. So it's not clear to me why LISTEN/NOTIFY needs additional serialization. Perhaps PG should turn NOTIFY into INSERTs on a special table that a worker process watches and turns those inserts into notifies (and deletes the inserts).

physix|7 months ago

That may hold to a certain extent for relational databases where your business model doesn't align well with physical model (tables). Although you might wonder why stored procedures and triggers were invented.

In databases where your domain is also your physical data model, coupling business logic to the database can work quite well, if the DBMS supports that.

https://medium.com/@paul_42036/entity-workflows-for-event-dr...

bevr1337|7 months ago

> the data does not get to decide what happens next based on itself.

Then why bother with a relational database? Relations and schemas are business logic, and I'll take all the data integrity I can get.

jl6|7 months ago

I think an argument can be made that relations, schemas and constraints encode a kind of business logic that is intrinsic to the definition and integrity of the data, while other types of business logic represent processes that may hinge on data but aren’t as tightly coupled to it. Similar to the difference between a primitive type and a function.

I guess some will argue that their business logic is special and really is so tightly coupled to the data definition that it belongs in the database, and I’m not going to claim those use cases don’t exist, but I’ve seen over-coupling far more often than under-coupling.

This is why I say: Applications come and go, but data is forever.

Jailbird|7 months ago

I've seen both of these philosophies. I liken them to religions, the believers are devout. Code is King vs the DB is King.

I'm personally Code is King, and I have my reasons (like everyone else)

platzhirsch|7 months ago

If you want your database to just store bytes, use a key-value store. But SQL gives you schemas and constraints for a reason; they're guardrails for your business logic. Just don’t ask your tables to run the business for you.

IgorPartola|7 months ago

If only different ORMs had more support for triggers and stored procedures. Things would be so much easier if I could do things like denormalize certain frequently accessed fields across tables but with proper ability to update them automatically without having to do them in application code.

whstl|7 months ago

This is one of those absolute statements that cause the kind of problem stated by grandparent. There are lots of those: "Use Postgres for everything", "No business data on the DB", "No methods bigger than 10 lines", "Abstractions only after 3 usages".

Back to the topic: Lots of potential bugs and data corruption issues are solved by moving part of the business logic to the database. Other people already covered two things: data validation and queue atomicity.

On the other hand, lots of potential issues can also arise by putting other parts of business logic to the database, for example, calling HTTPS endpoints from inside the DB itself is highly problematic.

The reality is that the world is not black and white, and being an engineer is about navigating this grey area.

cryptonector|7 months ago

Thank you for bringing some sanity into this discussion.

panzi|7 months ago

So what are your thoughts on constraints then? Foreign keys? Should that only be handled by the application, like Rails does (or did, haven't used in a long time).

fathomdeez|7 months ago

I don't think of those as business logic, per se. They're just validity checks on what the data should look like before it's written to disk - they're not actionable in the way L/N is. That being said, constraints usually end up being duplicated outside the db anyway, but having them where the data rests (so you don't have to assume every client is using the correct constraint code) makes sense.

Lio|7 months ago

Rails fully supports constraints and encourages you to use them.

You can either execute SQL in your migration or use add_check_constraint.

Footkerchief|7 months ago

You still use constraints even if you put all your business logic in stored procedures.

parthdesai|7 months ago

What happens to FKs when you've to partition/shard the db? At a certain scale, they actually hinder the inserts.

cryptonector|7 months ago

You're reaching the wrong conclusion, probably because of confirmation bias. Certainly this LISTEN/NOTIFY problem does not lead to your conclusion, nor does it support it. After all if you were relying on LISTEN/NOTIFY you could instead rely on logical replication decoding / CDC instead. And heck, you could even have a client connected to the same database that uses logical decoding to pick up events worth NOTIFYing about and then does just that, but without burdening any other transactions.

KronisLV|7 months ago

> That's what application code is for.

I've seen people who disagree with that statement and say that having a separate back end component often leads to overfetching and in-database processing is better. I've worked on some systems where the back end is essentially just passing data to and from stored procedures.

It was blazing fast, but working with it absolutely sucked - though for whatever reason the people who believe that seem to hold those views quite strongly.

Cthulhu_|7 months ago

It really depends, but it's also a factor of time, that is, "back in the day", databases were designed to serve many different clients, nowadays a common practice is to have a 1:1 relationship between a database and a client application.

Of course, this is sometimes abused and taken to extremes in a microservices architecture where each service has their own database and you end up with nastiness like data duplication and distributed locking.

sgarland|7 months ago

> Of course, this is sometimes abused and taken to extremes in a microservices architecture where each service has their own database and you end up with nastiness like data duplication and distributed locking.

Not to mention the difficulty in maintaining referential integrity with all of that duplicated data. There are various workarounds, but at that point it feels very much like we’re recreating a shared DB, but shittily, and netting zero benefits.

sgarland|7 months ago

Disagree; these issues come up when people use more advanced features of DBs without having the requisite DB expertise on staff. I’ll give OP that Postgres’ docs do not mention this gotcha (and props to them for drilling down to source code!), but by and large, these issues are from people operating via tech blogs.

The DB is - or should be - the source of truth for your application. Also, since practically everyone is using cloud RDBMS with (usually) networked storage, the latency is atrocious. Given those, it seems silly to rely on an application to react to and direct changes to related data.

For example, if you want to soft-delete customer data while maintaining the ability to hard-delete, then instead of having an is_deleted and/or deleted_at column, have a duplicate table or tables, and an AFTER DELETE trigger on the originals that move the tuples to the other tables.

Or if you want to have get_or_create without multiple round trips (and you don’t have Postgres’ MERGE … RETURNING), you can easily accomplish this with a stored procedure.

Using database features shouldn’t be seen as verboten or outdated. What should be discouraged is not treating things like stored procedures and triggers as code. They absolutely should be in VCS, should go the same review process as anything else, and should be well-documented.

djfivyvusn|7 months ago

That's purely because nobody knows how to write SQL let alone stored procedures. If stored procedures had better devex they'd be used for most of your app.

sgarland|7 months ago

Postgres lets you write stored procedures out of the box in pgSQL, C, Tcl, Perl, and Python. There are also 3rd party extensions for most languages you might want, including Rust and JS.

More broadly, not knowing how to write SQL is a very solvable problem, and frankly anyone accessing an RDBMS as a regular part of their job should know it. Even if you’re always using an ORM, you should understand what it’s doing so you can understand the EXPLAIN output you’ll probably be looking at eventually.