Zvez|2 years ago
If you give access to your DB directly, your DB effectively becomes your API, with all the contract obligations of an API. Suddenly you don't fully control your schema: you can't freely change it, and you need to add things there for your clients only. I've seen it done multiple times and it always ends up poorly. You save some time now by removing the need to build an API, but later you end up spending much more time trying to decouple your internal representation from the schema you made public.
bennyelv|2 years ago
The system that I'm currently responsible for made this exact decision. The database is the API, and all the consuming services dip directly into each other's data. This is all within one system with one organisation in charge, and it's an unmanageable mess. The pattern suggested here is exactly the same, but with each of the consuming services owned by different organisations, so it will only be worse.
Change in a software system is inevitable, and to safely manage change you need a level of abstraction between the inside of a domain and the outside, plus a strictly defined API contract with the outside that you can version control.
Could you create this with a layer of stored procedures on top of database replicas as described here? Theoretically yes, but in practice no. In exactly the same way that you can theoretically service any car with only a set of mole-grips.
camgunz|2 years ago
IME, data pipelines implement versioning with namespaces/schemas/versioned tables. Clients are then free to use whatever version they like. You then have the same support/maintenance policy as you would for any software package or API.
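A minimal sketch of that versioning idea, using SQLite via Python's stdlib `sqlite3` (table and view names here are made up; SQLite has no schemas, so a name prefix stands in for what would be `CREATE SCHEMA api_v1` in e.g. Postgres):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- internal table, free to change
    CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO users VALUES (1, 'Ada Lovelace');

    -- v1 contract: clients were promised a column called 'name'
    CREATE VIEW api_v1_users AS
        SELECT id, full_name AS name FROM users;

    -- v2 contract exposes the new column name; v1 clients are untouched
    CREATE VIEW api_v2_users AS
        SELECT id, full_name FROM users;
""")

# a v1 client keeps working even though the column was renamed internally
row = conn.execute("SELECT name FROM api_v1_users").fetchone()
```

Old versions can then be dropped on the same deprecation schedule you would use for a library release.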
vbezhenar|2 years ago
Stored procedures can technically do anything, I guess, but at that point you would be better off with traditional services, which give you more flexibility.
tacker2000|2 years ago
Also, using views and stored procedures with source control is a pain.
Deploying these into prod is also much more cumbersome than just normal backend code.
Accessing a view can also be slower than accessing the “original” table, since the view's underlying query has to be evaluated on each access.
madsbuch|2 years ago
Versioned views and materialized views are essentially API endpoints in this context, just developed in SQL instead of some sane language.
rewmie|2 years ago
Wouldn't it be far simpler to just create a service providing access to those views with something like OData?
DrScientist|2 years ago
So when you provide an API - you don't make all functions in your code available - just carefully selected ones.
If you use the DB schema as a contract you simply do the same - you don't let people access all functions - just the views/tables they need/you can support.
Just like APIs, databases give you tools to evolve - for example, maintaining views that keep a contract while the underlying schema changes.
In the end, if your schema changes dramatically - in particular, changes like a 1:1 relation becoming 1:many - it's pretty hard to stop that rippling through your entire stack, however many layers you have.
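The "views keep a contract" idea above can be sketched with SQLite in Python (the table split and all names are hypothetical): the address column moves out into its own table, but a view preserves the old flat shape clients were promised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- the public contract used to be a flat customers(id, name, city) table;
    -- internally, city has since been split into a separate table
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (customer_id INTEGER, city TEXT);
    INSERT INTO customers VALUES (1, 'Acme');
    INSERT INTO addresses VALUES (1, 'Berlin');

    -- the view reassembles the original flat shape for consumers
    CREATE VIEW customers_api AS
        SELECT c.id, c.name, a.city
        FROM customers c
        LEFT JOIN addresses a ON a.customer_id = c.id;
""")

flat = conn.execute("SELECT id, name, city FROM customers_api").fetchone()
```

This hides a 1:1 split cleanly; as the comment notes, a move to 1:many is much harder to paper over, because the view would then return multiple rows per customer.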
DasIch|2 years ago
What are the database tools for access logs, metrics on throughput, latency, tracing etc.? Not to mention other topics like A/B tests, shadow traffic, authorization, input validation, maintaining invariants across multiple rows or even tables...
Databases often either have no tools for this or they are not quite as good.
ignoramous|2 years ago
tbf, the idea isn't that novel. Data warehouses, for instance, provide SQL as a direct API on top of the data.
fatnoah|2 years ago
In a past life, I worked for a large (non-Amazon) online retailer, and "shipping the DB" was a massive boat anchor the company had to drag around for a long time. They still might be, for all I know. So much tech and infra sprang up to work around this, but at some point everything came back to the same database, with countless tables and columns whose purpose no one knew but that couldn't be changed because it might break some random team's work.
herodoturtle|2 years ago
From the article:
> A less obvious downside is that the contract for a database can be less strict than an API. One benefit to an API layer is that you can change the underlying database structure but still massage data to look the same to clients. When you’re shipping the raw database, that becomes more difficult. Fortunately, many database changes, such as adding columns to a table, are backwards compatible so clients don’t need to change their code. Database views are also a great way to reshape data so it stays consistent—even when the underlying tables change.
Neither solution is perfect (raw read replica vs. API). There are pros and cons to both; knowing when to use which comes down to one's needs.
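The quoted point about additive changes being backwards compatible can be shown concretely with SQLite in Python (table and column names invented for illustration): a client that names its columns instead of using `SELECT *` is unaffected by a new column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 9.99)")

def client_read(db):
    # a well-behaved client names its columns rather than SELECT *
    return db.execute("SELECT id, total FROM orders").fetchall()

before = client_read(conn)

# additive schema change: existing readers keep working
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'")
after = client_read(conn)
```

A `SELECT *` client, by contrast, would silently start seeing the extra column, which is exactly the kind of implicit coupling the article's caveat is about.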
DarkNova6|2 years ago
My last customer used an ETL tool to orchestrate their data loads between applications, but the only out-of-the-box solution was a DB reader.
Eventually, no system could be changed without breaking another, and the central GIS system had to be gradually phased out. It also meant that everybody had to use Oracle databases, since that was the "best supported platform".
Dudester230602|2 years ago
When that thing fails again they will hopefully settle on a sane monolithic API.
chedabob|2 years ago
Don't get me wrong, the amount of time it saves is massive compared to rolling your own equivalent, but it doesn't take long before you've dug yourself a big hole that would conventionally be solved with a thin API layer.
oldnet|2 years ago
That's why APIs exist in the first place.
camgunz|2 years ago
[0]: https://www.postgresql.org/docs/15/ddl-rowsecurity.html
knallfrosch|2 years ago
I'm not too familiar with DBs, but I know customers. They're going to present custom views to your client SDK. They're going to mirror your read-only DB into their own and implement stuff there. They're going to depend on every implementation detail of your DB's specific version ("It worked with the last version and YOU broke it!"). They're going to run the slowest joins you've ever seen just to get data that belongs together anyway, and that you would have written a performant resolver for.
Oh, and of course, you will need 30 client libraries. Python, Java, Swift, C++, JavaScript and 6+ versions each. Compare that to "hit our CRUD REST API with a JSON object, simply send the Authorization Bearer ey token and you're fine."
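For contrast, the entire REST-side client the comment describes fits in a few lines of stdlib Python (endpoint and token are placeholders; the request is built but not sent):

```python
import json
import urllib.request

# hypothetical endpoint and bearer token, for illustration only
req = urllib.request.Request(
    "https://api.example.com/v1/widgets",
    data=json.dumps({"name": "sprocket"}).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer eyJhbGciOi...",  # truncated example token
    },
    method="POST",
)
# urllib.request.urlopen(req) would actually send it
```

Any language with an HTTP client can speak this contract, which is the point: no per-language driver, no replica credentials, no version-pinned wire format.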
matt_s|2 years ago
Building an API for a new application is a pretty simple undertaking and gives you an abstraction layer between your data model and API consumers. Building a suite of tests against that API that run continuously with merges to a develop/test environment will help ensure quality. Why would anyone advise to just blatantly skip out on solid application design principles? (clicks probably)
camgunz|2 years ago
> Building an API for a new application is a pretty simple undertaking
This is super untrue: backend engineering is pretty hard and complicated, and there aren't enough people to do it. And this is coming from someone who thinks it should be replaced with SaaS offerings like Hasura rather than remain a manual process.
> Building a suite of tests against that API that run continuously with merges to a develop/test environment will help ensure quality.
You can test your data pipelines too; we do at my job and it's a lot easier than managing thousands of lines of PyTest (or whatever) tests.
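A sketch of what "testing the pipeline" can look like, again with stdlib SQLite in Python (the events/rollup shape is invented): build the pipeline against an in-memory database and pin its published output with a bare assertion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# toy "pipeline": raw events rolled up into a published view
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, amount REAL);
    INSERT INTO events VALUES (1, 2.0), (1, 3.0), (2, 5.0);
    CREATE VIEW user_totals AS
        SELECT user_id, SUM(amount) AS total
        FROM events GROUP BY user_id;
""")

rows = conn.execute(
    "SELECT user_id, total FROM user_totals ORDER BY user_id"
).fetchall()

# the published contract, pinned by a test
assert rows == [(1, 5.0), (2, 5.0)]
```

The same pattern scales to real pipelines: seed fixture data, run the transformation, and assert on the versioned view a consumer would read.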
> Why would anyone advise to just blatantly skip out on solid application design principles?
Because building an API takes a lot of time and money, and maintaining it takes even more. It would be cool if we didn't have to do it.
[0]: https://github.com/benbjohnson