Show HN: Octo – Generate a serverless API from an SQL query

eatonphil|5 years ago

I've got a similar project that reads your db schema and generates a Go REST API and a TypeScript/React web interface. (The code-generation is language agnostic so at some point I'd like to add at least a Java REST API as well.) It supports PostgreSQL, MySQL, and SQLite.

Unlike PostgREST/Hasura and some other dynamic tools you can "eject" at this point if you'd like and continue on development without the generator in a language you already know. But I'm working on exposing Lua-based hooks you could carry across whatever backend language you choose to generate and avoid the need to eject.

It has builtin support for paginated bulk GET requests with filtering, sorting, limiting. Built-in support for bcrypt-password authentication and optional SQL filters specified in configuration for authorization of particular endpoints based on session and request metadata.

Still very much a work in progress but the goal is to push the envelope on application boilerplate.

Screenshots are of the example/notes project in the repo.

https://www.dbcore.org/

https://github.com/eatonphil/dbcore

MuffinFlavored|5 years ago

I feel like projects like this work for simple stuff but as soon as you need analytics/insights or actual business logic, you almost always need to just "roll your own" API. Am I wrong? Do other people feel this way? Can anybody think of a few projects they've worked on that would be too complex/a ton of work to make work with these kind of simple template generators?

henryfjordan|5 years ago

This is not dis-similar to what Strapi.io does, although I don't think they realize that's a big selling point from their marketing materials.

With strapi you configure your DB and get code generated in JS that supports a standard CRUD REST API. If you want to add business logic, you can override any particular endpoint you want. Their docs even come with the default implementation for easy copy/paste.

I would love to see research in this space continue, I think it's the future of bringing non-technical people into the product development process (if you can understand building a workflow with Excel/Google Sheets/Airtable, you can understand building an API). I'm excited to check out your project.

jdc|5 years ago

In case anyone else is wondering what's doing the templating in this project, it's Scriban.

https://github.com/lunet-io/scriban

Trisell|5 years ago

The idea is interesting. But it looks like you end up with a yaml file that enumerates each of your tables/endpoints and the queries that back them. So are we exchanging the “complexities” of code, where we have control and testing, for the “lack of complexity” of yaml that becomes unwieldy and untestable in the name of “simplicity?”

stingraycharles|5 years ago

Don’t forget that at some point, you’ll want to generate the yaml from code, because otherwise it becomes impossible to maintain. And quickly you’ll find yourself back at square 1. :)

Glyptodon|5 years ago

One of the things that's not obvious to me about things like this (and other similar tools) is where/how scopes/limitations/permissions are handled. I assume they either are or can be, I just never see it spelled out clearly. What am I missing?

lukeramsden|5 years ago

I can't speak for this project specifically but for some context, Postgraphile's way of solving this, as it only supports Postgres, is to use Postgres's Row Level Security feature, whereby you enforce scopes and permissions at the data layer, as well as using table grants to roles specified in JWTs and such. (https://www.graphile.org/postgraphile/postgresql-schema-desi...)

This project doesn't seem to have any inbuilt AuthZ functionality, so unless your database has that built in like Postgres, or you need something that's not possible in-database, I guess you just... can't.

eatonphil|5 years ago

I tackled this by using SQL as the filter language where the generated code will fill in some context-specific variables like current session user id or current request object id.

This is a little limited in this current form and I'm working to expand SQL filters to match up to HTTP codes so you can say this request needs to have a session otherwise it is a 401, then it also needs to match another filter otherwise it is a 403. But this other endpoint is ok to show without a session if the object being requested is marked public in the db.

There's a lot to think through especially when extending these filters to bulk methods.

https://github.com/eatonphil/dbcore/blob/master/examples/not...

akie|5 years ago

Perhaps I'm old, but who needs an API for an SQL query? I'm not sure I understand the use case, or the advantage of something like this over a regular API call to a backend which would also allow you to do e.g. authentication. Enlighten me?

chadhutchins10|5 years ago

One of the best use cases for this is say you have a backend/internal system and you want other things to start interacting with it. Instead of having to write the api to interface with it, you can just use something like this and with little effort you have an api and can talk with the database.

mritchie712|5 years ago

As part of our product (https://seekwell.io/), we let people access SQL results with an API key and unique endpoints per query. There's also an option to add parameters.

The main use case is giving a data scientist or another application access to the results of a few arbitrary queries without giving them full access to the database. So it's a bit like giving them access to a SQL view, but without them needing to set up a driver, etc. to connect.

sixdimensional|5 years ago

In my case, I have a simple obvious use case.

I work for a large corporation. They want to implement the Bezos mandate [1]. No direct database access between teams, API abstraction for everything.

OK, now let's think in onion layers (or hexagonal/clean architecture if you like). Think of layers of services with different purposes - data services (containing no application/business logic), application services (for business logic, orchestration, process), and UI/UX services (to power differing end user experiences).

Data services don't have to do much - be the data/repository layer, expose productive interfaces for CRUD. Need to read across data sources? Think of federated data services that can combine data on the fly, perhaps like GraphQL.

These kinds of tools are perfect for the first layer of services that abstract the database world from the application world. Just simple services, even ones that effectively let you mimic what SQL queries can do (filter, sort, page, etc.). Individual record oriented interfaces, and bulk oriented interfaces. The query side of CQRS (command query separation).

Many will say, "I don't need all this complexity and layers" - and sure, for smaller or simpler applications, probably not!

But, if you have to operate on any kind of larger scale, with multiple data sources, systems, etc., you end up needing the layers. And these types of tools automate some of the lower layers.

Perhaps when we talk about this, we shouldn't be focusing on "oh it's too complicated", and instead building frameworks or reference architectures that automate away the complexity - so it looks easy again, but now it is more flexible, perhaps easier to scale.

I believe that we are on the cusp still, of an almost fully defined, service based architecture (microservices and server less were just one part of the continuing development of that story). Federation is another part of that story oft ignored. Thinking of the onion as service layers is another part. Erasing the network boundary as a concern through much higher speed internetworking is another part.

Eventually we may come to see, that it is all a big "system", some parts just aren't connected to each other directly.

Sorry, got a bit rant-y at the end there :) Just passionate about sharing this world view with others - as I continue to see this architecture developing!

[EDIT] I wanted to add, it's not just that the use case for this is in a data service layer for automation - from a logical perspective I mean. In big companies such as the one I work for, we never get the resources we need, ironically. We are overwhelmed with demands, and must operate under the Bezos mandate rules. Tools such as Octo are not a panacea, but, they are a good compromise if you have to move fast, they are time-and-cost-savers. And they can get you surprisingly far.

[1] https://www.calnewport.com/blog/2018/09/18/the-human-api-man...

cube2222|5 years ago

Looks great!

If you like this, check out OctoSQL[0]... Also in Go... Though OctoSQL lets you query multiple databases / files / event streams like kafka using SQL from your command line, not as a server, so a fairly different use case, but you should check it out nevertheless!

The naming clash is funny.

[0]: https://github.com/cube2222/octosql

jhoechtl|5 years ago

I realy like your tool. In fact I am slowly integrating it into a solution which will expose a REST API and workspaces identified by a UUID. In our organisation it is so common to receive an Excel or csv which you have to join with the database. Octosql is great for that.

I am wondering what future role badger will play in the future? It would also make a great additional KV backend btw.

ForHackernews|5 years ago

Looks vaguely similar to http://postgrest.org/

lukeramsden|5 years ago

And Hasura, Postgraphile et al too. These, as well as PostgREST also give you much more flexibility in the form of plugins in library mode and other such things - they also generate the actual queries for you, via introspection, as opposed to this which requires you to write the query yourself.

I think there's certainly space for this project, i.e. hand-written queries, on any database (Postg[REST|raphile] both only work with Postgres of course, not sure about Hasura). Not sure it will succeed without support for more forms of Serverless deployment, primarily Lambda.

alexellisuk|5 years ago

Nice to see openfaas featured here and thanks for your PRs to Arkade. I do wonder what your strategy is on connection pooling and authentication?

Also not keen on the passwords being kept in a plaintext file - someone will check that into git. OpenFaaS has secret support which you can use Amal. So does Knative.

mitjam|5 years ago

Reminds me of the venerable Datasette by Simon Willison: https://github.com/simonw/datasette

o1lab|5 years ago

Interesting concept and quite liked the playful logo. Can we pass in env variables to db connection ?

We are in similar space, we take input params of db and generate CRUD apis with Auth+ACL and then APIs are packed into a single lambda function. There is support for serverless framework as well.

[1]: https://github.com/xgenecloud/xgenecloud

amal_kh5|5 years ago

Yes, for example:-

? Enter the database password: ${DB_PASSWORD}

alexzender|5 years ago

Interesting, I've built a similar project that generates GraphQL API based on your database schema - https://okdb.io

bobbywilson0|5 years ago

My main purpose of tools like these has always been prototypes, or hobby one-off type stuff. For SPAs, or a sketch with a Jupyter notebook. They're great for this sort of thing because in my experience, this used to require building some sort of API just to get a simple json interface to the database. It was my understand that the purpose of these types of tools was mostly that.

Are folks using these kind of things for non-trivial production applications?

hn_throwaway_99|5 years ago

I fear that all of these "expose your DB as an API" tools like this, Postgraphile, Hasura, etc. are going to set up folks for a world of hurt down the road. Tightly coupling your end clients to your database schema can make it extremely difficult, if not impossible, to refactor your DB in you need to (which is highly likely).

brycelarkin|5 years ago

I’m building a project using one of those tools. I imagine that difficulty refactoring your database is more a problem of bad schema design than the tool. If you normalize and abstract out the implementation details into Views, I can’t see how refactoring would be difficult. Haven’t built anything at scale with Postgraphile/Harusa, so just wondering if I’m missing anything here.

TylerE|5 years ago

Views make it trivial to decouple what a query returns from the underlying schema

modarts|5 years ago

Looks like someone took this tweet literally https://twitter.com/davecheney/status/1296033304756404225

unknown|5 years ago

[deleted]

rimkms|5 years ago

Logo is looking good , gj

revskill|5 years ago

Do u know any similar tool wwhich supports group by query ?

unknown|5 years ago

[deleted]

amani92|5 years ago

Very impressive, Great job.

WrtCdEvrydy|5 years ago

Your timing is perfect.

ahmadbana|5 years ago

Very interesting projects and can be scalable Keep up the good work

58 comments