georgewfraser | 3 years ago

The world would be a better place if database drivers were completely abandoned as a way for clients to connect to databases. A standard API, implemented by multiple vendors, is a vastly preferable solution. Arrow Flight is an example of this.

https://arrow.apache.org/blog/2019/10/13/introducing-arrow-f...

jeff-davis|3 years ago

I generally think that's a good idea, but be aware that the protocols are more interesting than you might first imagine, and that leads to a lot of the differences between drivers for different databases.

For instance, when setting a user's password in Postgres, you can do the hashing on the client side, even for non-trivial schemes like SCRAM. This means that the password itself never needs to move over the network, and that's very desirable. Authentication methods in general are a big topic of their own.
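To make the client-side hashing concrete, here is a minimal sketch of how a client could build a SCRAM-SHA-256 verifier locally before handing it to the server. The function name `scram_sha256_verifier` is made up for illustration, the derivation follows RFC 5802, and the sketch assumes a plain ASCII password because it skips the SASLprep normalization step that a real driver would apply.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional


def scram_sha256_verifier(password: str,
                          salt: Optional[bytes] = None,
                          iterations: int = 4096) -> str:
    """Hypothetical helper: derive a SCRAM-SHA-256 verifier on the client.

    Assumption: the password is plain ASCII, so SASLprep is skipped.
    """
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password

    # SaltedPassword = PBKDF2-HMAC-SHA256(password, salt, iterations)
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)

    # The server stores only values derived from the salted password.
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    # Layout: SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>
    return (f"SCRAM-SHA-256${iterations}:{b64(salt)}"
            f"${b64(stored_key)}:{b64(server_key)}")
```

The resulting string, not the password, is what would go into something like `ALTER ROLE ... PASSWORD '...'`, so a statement log on the server only ever sees the verifier.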

There are also important modes. For instance, the client encoding controls how strings are transcoded when they get to the server. That allows the client to not know/care what the encoding of the database is. You could demand that everything is UTF-8, and that's one philosophy, but not everyone agrees.
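As a rough illustration of what that transcoding amounts to, here is a toy sketch. The `transcode` helper is hypothetical and only mimics the idea; it is not how any particular server implements it.

```python
def transcode(payload: bytes, client_encoding: str, server_encoding: str) -> bytes:
    """Hypothetical stand-in for the conversion a server does on receipt."""
    return payload.decode(client_encoding).encode(server_encoding)


# The client speaks Latin-1 and does not need to know the database is UTF-8.
wire = "café".encode("latin-1")               # 4 bytes on the wire
stored = transcode(wire, "latin-1", "utf-8")  # 5 bytes as stored

# On the way back, the conversion runs in the other direction.
returned = transcode(stored, "utf-8", "latin-1")
```

The client only declares its own encoding once; the conversion in both directions is the server's problem.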

In practice, I think it'll be a while before there is consensus on all these points. And even when there is, the standard will need to evolve to handle new auth methods, etc.

If we invent a standard protocol, it will probably be more of a fallback for simple cases when the language framework doesn't offer a driver yet. Still helpful, though.

EthicalSimilar|3 years ago

> For instance, when setting a user's password in Postgres, you can do the hashing on the client side, even for non-trivial schemes like SCRAM. This means that the password itself never needs to move over the network, and that's very desirable.

Off-topic, but I’m surprised more online apps don’t employ something similar.

It would all but eliminate accidental leaks that occur from logs being incorrectly stored / misconfigured, not to mention worries about MITM attacks (useful for corporate networks, or public networks).

Given how many people share usernames, emails, and passwords across sites I find it quite important to mitigate those issues as much as possible.

kardianos|3 years ago

Nice.

This [1] appears to be the SQL layer built on top of Arrow Flight. It seems a bit chatty: if I read it correctly, each query requires two network requests.

[1] https://arrow.apache.org/docs/format/FlightSql.html

lidavidm|3 years ago

Yup. The chattiness is to account for distributed databases, so you can spread the result set across multiple instances.
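A toy model of that pattern, to show where the extra request comes from. `ToyCluster`, `Endpoint`, and `run_query` are invented for this sketch and are not the pyarrow API; they only mirror the shape of a GetFlightInfo call followed by one DoGet per endpoint.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    ticket: str    # opaque handle the client presents to fetch data
    location: str  # which instance holds this part of the result


@dataclass
class ToyCluster:
    partitions: dict   # location -> list of rows held there
    requests: int = 0  # counts simulated network round trips

    def get_flight_info(self, query: str) -> list:
        # Request 1: plan the query and report where the results live.
        self.requests += 1
        return [Endpoint(ticket=f"{query}@{loc}", location=loc)
                for loc in self.partitions]

    def do_get(self, endpoint: Endpoint) -> list:
        # Request 2..N: fetch one partition of the result set.
        self.requests += 1
        return list(self.partitions[endpoint.location])


def run_query(cluster: ToyCluster, query: str) -> list:
    rows = []
    for endpoint in cluster.get_flight_info(query):
        rows.extend(cluster.do_get(endpoint))
    return rows
```

With a single backend this costs two round trips for every query, which is the chattiness being discussed. With several backends the same shape lets each partition be fetched from the instance that holds it, and a real client could issue those fetches in parallel.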

That said, there is a proposal for base Flight RPC to allow embedding small results directly in the first response; it mostly needs someone to draft a prototype and push it through. (That doesn't help the case of a large-ish response from a single backend, though; that may also need some work if we want to get rid of the second request.)