The world would be a better place if database drivers were completely abandoned as a way for clients to connect to databases. A standard API, implemented by multiple vendors, is a vastly preferable solution. Arrow Flight is an example of this.
Even within the Arrow project, there's still room for drivers just because not every vendor is going to implement the same wire protocol (at least on a feasible timeline). Hence both "ADBC" [1] and Flight SQL [2] (note: NOT a SQL dialect, it is a wire protocol) coexist in complementary niches.
I generally think that's a good idea, but be aware that the protocols are more interesting than you might first imagine, and that leads to a lot of the differences between drivers for different databases.
For instance, when setting a user's password in Postgres, you can do the hashing on the client side, even for non-trivial schemes like SCRAM. This means that the password itself never needs to move over the network, and that's very desirable. Authentication methods in general are a big topic of their own.
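To make the client-side hashing point concrete, here is a minimal sketch of deriving a SCRAM-SHA-256 verifier locally, following the key derivation in RFC 5802 / RFC 7677. The function name `scram_sha256_verifier` is hypothetical, the output layout is the one Postgres is understood to use for stored SCRAM secrets, and SASLprep normalization is skipped on the assumption that the password is plain ASCII.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional


def scram_sha256_verifier(password: str,
                          salt: Optional[bytes] = None,
                          iterations: int = 4096) -> str:
    """Derive a SCRAM-SHA-256 verifier on the client (hypothetical helper)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    # Assumption: ASCII-only password, so SASLprep normalization is skipped.
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    # Only the salt, iteration count, and derived keys appear in the result.
    return f"SCRAM-SHA-256${iterations}:{b64(salt)}${b64(stored_key)}:{b64(server_key)}"
```

The resulting string, rather than the password, is what the client would send in something like `ALTER ROLE ... PASSWORD '<verifier>'`. In practice you would probably let the driver do this for you (libpq has `PQencryptPasswordConn`, and psql has `\password`) instead of rolling your own.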
There are also important modes. For instance, the client encoding controls how strings are transcoded when they get to the server. That allows the client to not know/care what the encoding of the database is. You could demand that everything is UTF-8, and that's one philosophy, but not everyone agrees.
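As a rough illustration of what that transcoding amounts to, here is a toy sketch. The `server_receive` and `server_send` functions are hypothetical stand-ins for what the server does internally, not a real API; the only assumption is that the client has declared its encoding (in Postgres, via the `client_encoding` setting).

```python
def server_receive(raw: bytes, client_encoding: str,
                   server_encoding: str = "utf-8") -> bytes:
    """Convert bytes sent by the client into the database's own encoding."""
    return raw.decode(client_encoding).encode(server_encoding)


def server_send(stored: bytes, client_encoding: str,
                server_encoding: str = "utf-8") -> bytes:
    """Convert stored bytes back into the encoding the client asked for."""
    return stored.decode(server_encoding).encode(client_encoding)


# A Latin-1 client sends one byte for "é"; the UTF-8 database stores two.
from_client = "café".encode("latin-1")
stored = server_receive(from_client, "latin-1")
back_to_client = server_send(stored, "latin-1")
```

The client never has to know that the database is UTF-8; it gets back exactly the bytes it would expect in its own encoding.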
In practice, I think it'll be a while before there is consensus on all these points. And even when there is, the standard will need to evolve to handle new auth methods, etc.
If we invent a standard protocol, it will probably be more of a fallback for simple cases when the language framework doesn't offer a driver yet. Still helpful, though.
> For instance, when setting a user's password in Postgres, you can do the hashing on the client side, even for non-trivial schemes like SCRAM. This means that the password itself never needs to move over the network, and that's very desirable.
Off-topic, but I’m surprised more online apps don’t employ something similar.
It would all but eliminate accidental leaks that occur from logs being incorrectly stored / misconfigured, not to mention worries about MITM attacks (useful for corporate networks, or public networks).
Given how many people share usernames, emails, and passwords across sites, I find it quite important to mitigate those issues as much as possible.
This [1] appears to be the SQL-specific layer on top of Arrow Flight. It seems a bit chatty: if I read it correctly, each query requires two network requests.
Yup. The chattiness is to account for distributed databases, so you can spread the result set across multiple instances.
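A toy model of that two-step pattern, to show why the split exists. The method names mirror the real Flight RPCs (`GetFlightInfo`, then `DoGet` per endpoint), but `ToyCluster`, `Endpoint`, and `run_query` are hypothetical in-memory stand-ins, not the pyarrow API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Endpoint:
    location: str  # which node holds this partition of the result
    ticket: str    # opaque token redeemed in the follow-up request


class ToyCluster:
    """Hypothetical in-memory stand-in for a distributed Flight SQL service."""

    def __init__(self, partitions: Dict[str, List[int]]):
        self.partitions = partitions  # node name -> rows held on that node
        self.requests = 0             # counts round trips, for illustration

    def get_flight_info(self, query: str) -> List[Endpoint]:
        # Request 1: the service says where the results live, not the rows.
        self.requests += 1
        return [Endpoint(node, f"{query}@{node}") for node in self.partitions]

    def do_get(self, endpoint: Endpoint) -> List[int]:
        # Request 2, once per endpoint: fetch one partition of the result.
        self.requests += 1
        return self.partitions[endpoint.location]


def run_query(cluster: ToyCluster, query: str) -> List[int]:
    rows: List[int] = []
    for endpoint in cluster.get_flight_info(query):
        rows.extend(cluster.do_get(endpoint))
    return rows
```

With a single backend this is two round trips for one query, which is the chattiness in question; with several backends, a real client could fetch the endpoints in parallel rather than in a loop.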
That said, there is a proposal for base Flight RPC to allow embedding small results directly in the first response; it mostly needs someone to draft a prototype and push it through. (That doesn't help the case of a large-ish response from a single backend, though; that may also need some work, if we want to get rid of the second request.)
lidavidm|3 years ago
[1]: https://arrow.apache.org/docs/format/ADBC.html [2]: https://arrow.apache.org/docs/format/FlightSql.html
jeff-davis|3 years ago
EthicalSimilar|3 years ago
kardianos|3 years ago
[1] https://arrow.apache.org/docs/format/FlightSql.html
lidavidm|3 years ago