lbrandy | 2 years ago

I agree entirely with the premise here save one subtle bit at the start. I think there is grave danger in treating "vector database" and "vector search" as equivalent domains and/or pieces of software. I would argue that for "vector databases" there are a lot more "database" problems than "vector" problems to be solved.

I fear there's going to be a lot of home-rolled "vector search" infra that accidentally wanders into an ocean of database problems.

fzliu|2 years ago

Totally agree. It takes _a lot_ to go from Hnswlib to a full-fledged vector database.

Here's an architecture diagram for a production-ready vector database: https://milvus.io/docs/architecture_overview.md. Not exactly something you can build in a month.

ShamelessC|2 years ago

> I would argue that for "vector databases" there are a lot more "database" problems than "vector" problems to be solved.

Why the need for new technologies then? Databases are well studied. Vector search is relatively easy to implement. Sure, there are some new insights to be gained by respecting a hybrid approach - but they are clearly overvalued.

Machine learning is supposed to make things easier. If you implement vector search across your company's data, there's no reason an LLM couldn't simply do the various SQL-style operations on chunks of that data retrieved via KNN. I'm not aware of this approach being used in practice - but I still think the obvious direction we are heading towards is to be able to talk to computers in plain English, not SQL or some other relational algebra framework.
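
For a sense of scale on the "vector search is relatively easy" point, here is a rough sketch of exact (brute-force) KNN over an in-memory dict, in plain Python. The names `cosine_similarity` and `knn` and the toy data are made up for illustration; this is not how Hnswlib or any real vector database does it (those use approximate indexes), and it deliberately ignores every "database" concern raised upthread (persistence, updates, concurrency, filtering).

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn(query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the keys of the k stored vectors most similar to the query.

    Exact search: scores every stored vector, so cost grows linearly with
    the number of vectors. Fine for a toy, not for a large corpus.
    """
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical toy data: three 2-d "embeddings".
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = knn([1.0, 0.1], docs, k=2)
```

With this toy data, `nearest` comes out as `["a", "c"]`. Everything else a database has to provide is missing here, which is roughly the point the parent comment was making.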

ashvardanian|2 years ago

Exactly!

It's much easier to start from a database and add vector search as one of the features than to go backwards. We have spent 7.5 years on the DBMS part, while the vector search can literally be added in a week...

And that's why every major modern database is now integrating such solutions :)

betacat|2 years ago

So many projects forget the MS in DBMS.