top | item 37748726

shri_krishna | 2 years ago

What you're describing is possible in regular SQL databases with extensions. When it comes to scaling, however, traditional DBs don't have the tooling to do it automatically. Most extensions only provide the underlying ANN algorithm they implement and nothing more. Everything else you'll have to hand-roll yourself.

Clustering, load balancing, query aggregation, etc. are quite different for a vector database compared to a traditional OLTP database.

It's the same as the difference between OLAP and OLTP. The two have different underlying architectures, which makes them incompatible when you try to run both in an integrated fashion.

For instance, in a traditional DB the index is maintained and rebuilt alongside data storage, and to scale you can separate it into read and write nodes. The write nodes typically focus only on building indexes, while the read nodes serve queries from eventually consistent indexes (eventual consistency is achieved by broadcasting only the changed rows rather than sending the entire index).
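
To make the "broadcast only the changed rows" idea concrete, here is a minimal sketch in Python. It is not how any particular database implements replication; `WriteNode`, `ReadNode`, the changelog and the sequence numbers are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WriteNode:
    """Hypothetical write node: owns the index and logs every changed row."""
    index: dict = field(default_factory=dict)      # key -> row
    changelog: list = field(default_factory=list)  # (seq, key, row) entries

    def upsert(self, key, row):
        self.index[key] = row
        self.changelog.append((len(self.changelog) + 1, key, row))

    def changes_since(self, seq):
        # Only rows changed after `seq` go over the wire, never the full index.
        return self.changelog[seq:]


@dataclass
class ReadNode:
    """Hypothetical read node: serves queries from a possibly stale copy."""
    index: dict = field(default_factory=dict)
    applied_seq: int = 0

    def sync(self, writer):
        delta = writer.changes_since(self.applied_seq)
        for seq, key, row in delta:
            self.index[key] = row
            self.applied_seq = seq
        return len(delta)  # number of rows shipped in this sync


writer, reader = WriteNode(), ReadNode()
writer.upsert("a", 1)
writer.upsert("b", 2)
shipped_first = reader.sync(writer)   # both rows are new to the reader
writer.upsert("a", 3)
stale_value = reader.index["a"]       # reader is stale until the next sync
shipped_second = reader.sync(writer)  # only the one changed row is shipped
```

Between syncs the read node answers from an older copy, which is the eventual consistency part; the cost of catching up is proportional to the number of changed rows, not to the size of the index.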

It's similar in vector DBs. You can separate the indexer from the query nodes (which access an eventually consistent index). However, the load is way higher than in a regular DB: the index is humongous and takes a long time to build, and sharing it with the query nodes is also more time-consuming and resource/network intensive, since you're not sharing a few rows but the entire index itself. It requires a totally different strategy to get all query nodes to be eventually consistent.
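
A rough back-of-envelope comparison shows why shipping the whole index hurts. Every number below is an assumption picked for illustration (1M vectors, 768 float32 dimensions, a graph-style index with 32 four-byte links per vector, 200-byte changed rows), not a measurement of any real system.

```python
def full_index_bytes(num_vectors, dim, bytes_per_float=4,
                     links_per_node=32, bytes_per_link=4):
    """Approximate size of a graph-based vector index (assumed layout)."""
    vectors = num_vectors * dim * bytes_per_float
    graph = num_vectors * links_per_node * bytes_per_link
    return vectors + graph


def delta_bytes(changed_rows, bytes_per_row):
    """Approximate size of a changed-rows broadcast in a regular DB."""
    return changed_rows * bytes_per_row


# Assumed workload: 1M embeddings of 768 dims vs. 1,000 changed rows of 200 bytes.
index_size = full_index_bytes(1_000_000, 768)
delta_size = delta_bytes(1_000, 200)
ratio = index_size // delta_size

print(f"full index: {index_size / 1e9:.1f} GB")
print(f"row delta:  {delta_size / 1e3:.0f} KB")
print(f"ratio:      {ratio}x")
```

Under these assumptions the full index is about 3.2 GB per query node per refresh, against roughly 200 KB for the row delta. The exact figures will vary a lot with the index type and dimensions, but the gap is orders of magnitude either way.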

The only advantage of traditional DBs implementing vector extensions is familiarity for the end user. If you're already familiar with Postgres, you won't want to leave your comfort zone. However, scaling a traditional DB is different from scaling a vector DB; you'll hit those pain points only in production and be forced to switch to a proper vector database anyway.
