
Will Amazon S3 Vectors kill vector databases or save them?

280 points | Fendy | 5 months ago | zilliz.com

122 comments


simonw|5 months ago

This is a good article and seems well balanced despite being written by someone with a product that directly competes with Amazon S3 Vectors. I particularly appreciated their attempt to reverse-engineer how S3 Vectors works, including this detail:

> Filtering looks to be applied after coarse retrieval. That keeps the index unified and simple, but it struggles with complex conditions. In our tests, when we deleted 50% of data, TopK queries requesting 20 results returned only 15—classic signs of a post-filter pipeline.

Things like this are why I'd much prefer if Amazon provided detailed documentation of how their stuff works, rather than leaving it to the development community to poke around and derive those details independently.
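The post-filter behavior described in the quoted test can be sketched in a few lines. This is a toy model of the pipeline the article infers, not S3 Vectors' actual implementation: coarse retrieval fetches the top k ignoring the filter, then deleted rows are dropped, so the caller gets fewer than k results back.

```python
import random

# Toy index: 1000 items with scalar stand-in "embeddings"; delete 50%,
# then compare post-filtering vs. pre-filtering for a TopK=20 query.
random.seed(0)
items = [(i, random.random()) for i in range(1000)]
deleted = {i for i in range(1000) if i % 2 == 0}  # 50% of rows deleted

def post_filter_topk(query, k):
    # Coarse retrieval ignores the filter, then drops deleted rows,
    # so fewer than k results usually survive.
    coarse = sorted(items, key=lambda it: abs(it[1] - query))[:k]
    return [it for it in coarse if it[0] not in deleted]

def pre_filter_topk(query, k):
    # Filter first, then rank only the survivors: always k results.
    live = [it for it in items if it[0] not in deleted]
    return sorted(live, key=lambda it: abs(it[1] - query))[:k]

print(len(post_filter_topk(0.5, 20)))  # usually well under 20
print(len(pre_filter_topk(0.5, 20)))   # always 20
```

With half the data deleted, the post-filter path returns roughly half of the requested 20, which matches the "requested 20, got 15" symptom the article reports.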

libraryofbabel|5 months ago

> Things like this are why I'd much prefer if Amazon provided detailed documentation of how their stuff works, rather than leaving it to the development community to poke around and derive those details independently.

Absolutely this. So much engineering time has been wasted on reverse-engineering internal details of things in AWS that could be easily documented. I once spent a couple days empirically determining how exactly cross-AZ least-outstanding-requests load balancing worked with AWS's ALB because the docs didn't tell me. Reverse-engineering can be fun (or at least I kinda enjoy it) but it's not a good use of our time and is one of those shadow costs of using the Cloud.

It's not like there's some secret sauce in most of these implementation details (there aren't that many ways to design a load balancer). If there were, I'd understand not telling us. This is probably less an Apple-style culture of secrecy and more laziness, plus a belief that the important details have been abstracted away from us users because "The Cloud," when in fact these details really do matter for performance and other design decisions we have to make.

alanwli|5 months ago

The alternative is to find solutions that can reasonably support different requirements, because business needs change all the time, especially in the current state of our industry. From what I’ve seen, OSS Postgres/pgvector can adequately support a wide variety of requirements for millions to low tens of millions of vectors: low latencies, hybrid search, filtered search, the ability to serve out of memory and disk, and strong-consistency/transactional semantics with operational data. For further scaling/performance (1B+ vectors and even lower latencies), consider a SOTA Postgres system like AlloyDB with AlloyDB ScaNN.

Full disclosure: I founded ScaNN in GCP databases and am the lead for AlloyDB Semantic Search. And all these opinions are my own.

speedysurfer|5 months ago

And what if they change their internal implementation and your code depends on the old architecture? It's good practice to clearly think about what to expose to users of your service.

tw04|5 months ago

Detailed documentation would allow for a fair comparison of competing products. Opaque documentation allows AWS to sell "business value" to upper management while proclaiming anyone asking for more detail isn't focused on what's important.

apwell23|5 months ago

That would increase surface area of the abstraction they are trying to expose. This is not a case of failure to document.

One should only "poke around" an abstraction like this for fun and curiosity, not with the intention of putting the findings to real use.

redskyluan|5 months ago

Author of this article.

Yes, I’m the founder and maintainer of the Milvus project, and also a big fan of many AWS projects, including S3, Lambda, and Aurora. Personally, I don’t consider S3Vector to be among the best products in the S3 ecosystem, though I was impressed by its excellent latency control. It’s not particularly fast, nor is it feature-rich, but it seems to embody S3’s design philosophy: being “good enough” for certain scenarios.

In contrast, the products I’ve built usually push for extreme scalability and high performance. Beyond Milvus, I’ve also been deeply involved in the development of HBase and Oracle products. I hope more people will dive into the underlying implementation of S3Vector—this kind of discussion could greatly benefit both the search and storage communities and accelerate their growth.

redskyluan|5 months ago

By the way, if you’re not fully satisfied with S3Vector’s write, query, or recall performance, I’d encourage you to take a look at what we’ve built with Zilliz Cloud. It may not always be the lowest-cost option, but it will definitely meet your expectations when it comes to latency and recall.

Shakahs|5 months ago

While your technical analysis is excellent, making judgements about workload suitability based on a Preview release is premature. Preview services have historically had significantly lower performance quotas than GA releases. Lambda for example was limited to 50 concurrent executions during Preview, raised to 100 at GA, and now the default limit is 1,000.

pradn|5 months ago

Thanks for writing a balanced article - much easier to take your arguments seriously! And a sign of expertise.

jhhh|5 months ago

"That gap isn’t just theoretical—it shows up in real bills."

"That’s not linear growth—it’s a quantum leap"

"The performance and recall were fantastic—but the costs were brutal"

"it’s not a one-size-fits-all solution—it’s the right tool for the right job."

"S3 Vectors is excellent for cold, cheap, low-QPS scenarios—but it’s not the engine you want to power a recommendation system"

"S3 Vectors doesn’t spell the end of vector databases—it confirms something many of us have been seeing for a while"

"that’s proof positive that vector storage is a real necessity—not just “indexes wrapped in a database."

"the vector database market isn’t being disrupted—it’s maturing into a tiered ecosystem where different solutions serve different performance and cost needs"

"The golden age of vector databases isn’t over—it’s just beginning."

"The bigger point is that Milvus is evolving into a system that’s not only efficient and scalable, but AI-native at its core—purpose-built for how modern applications actually work."

qaq|5 months ago

"I recently spoke with the CTO of a popular AI note-taking app who told me something surprising: they spend twice as much on vector search as they do on OpenAI API calls. Think about that for a second. Running the retrieval layer costs them more than paying for the LLM itself. That flips the usual assumption on its head." Hmm well start sending full documents as part of context see it flip back :).

heywoods|5 months ago

Egress costs? I’m really surprised by this. Thanks for sharing.

dahcryn|5 months ago

if they use AzureSearch, I fully understand it. Those things are hella expensive

scosman|5 months ago

Anyone interested in this space should look at https://turbopuffer.com - I think they were first to market with S3 backed vector storage, and a good memory cache in front of it.

k9294|5 months ago

Turbopuffer is awesome, really recommend it. They also have extra features like automatic recall tuning based on your data, the option to choose read-after-write guarantees (trading latency for consistency or vice versa), BM25 search, filtering on fields, and many more.

I really recommend checking them out if you need a vector DB. I tried Qdrant and Zilliz Cloud solutions, and in terms of operational simplicity Turbopuffer is just killing it.

https://turbopuffer.com/docs/query

nosequel|5 months ago

Turbopuffer was mentioned in the article.

conradev|5 months ago

> At a glance, it looks like a lightweight vector database running on top of low-cost object storage—at a price point that is clearly attractive compared to many dedicated vector database solutions.

They also didn’t mention LanceDB, which fits this description but with an open source component: https://lancedb.github.io/lancedb/

kjfarm|5 months ago

This may be because LanceDB is the most attractive on price, at standard S3 storage rates ($0.023/GB vs $0.06/GB). I also like that LanceDB works with S3-compatible stores, such as Backblaze B2, which is even cheaper (~70% cheaper).

nickpadge|5 months ago

I love LanceDB. It’s the only way I’ve found to performantly and cheaply serve 50M+ records of 768 dimensions. It runs a bit too slow directly on S3, but on EFS it can still be a few hundred millis.

cpursley|5 months ago

Postgres has pgvector. Postgres is where all of my data already lives. It’s all open source and runs anywhere. What am I missing with the specialty vector stores?

CuriouslyC|5 months ago

Latency, actual retrieval performance, integrated pipelines that do more than just vector search to produce better results; the list goes on.

Postgres for vector search is fine for toy products or stuff that's outside the hot loop of your business but for high performance applications it's just inadequate.

teaearlgraycold|5 months ago

> Not too long ago, AWS dropped something new: S3 Vectors. It’s their first attempt at a vector storage solution

Nitpick: AWS previously funded pgvector (the slowdown in development suggests to me they have stopped), and their hosted database solutions supported the extension. That means RDS and Aurora were their first vector storage solutions.

physicsguy|5 months ago

The biggest killer of vector dbs is that normal DBs can easily store embeddings, and the vector DBs just don’t then offer enough of a differentiator to be a separate product.

We found our application was very sensitive to context aware chunking too. You don’t really get control of that in many tools.

janalsncm|5 months ago

S3 vectors has a topK limit of 30, and if you add filters it may be less than that. So if you need something with higher topK you’ll need to 1) look elsewhere or 2) shard your dataset into N shards to get NxK results, which you query in parallel and merge afterwards.

I also didn’t see any latency info on their docs page https://docs.aws.amazon.com/AmazonS3/latest/API/API_S3Vector...
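The shard-and-merge workaround in option 2 can be sketched as below. This is illustrative only: the per-query cap and shard layout are assumptions for the sketch, not an AWS API, and in practice the shard queries would run in parallel.

```python
import heapq
import random

# N shards, each answering capped top-k queries (stand-in for one
# S3 Vectors index per shard with a TopK limit of 30).
random.seed(1)
N_SHARDS, K = 4, 30
shards = [[random.random() for _ in range(100)] for _ in range(N_SHARDS)]

def shard_topk(shard, query, k):
    # Local top-k by distance; one capped query against one shard.
    return heapq.nsmallest(k, ((abs(v - query), v) for v in shard))

def global_topk(query, k):
    # Fan out to every shard (in parallel in a real system), collect
    # up to N_SHARDS * k candidates, then merge down to k.
    candidates = [hit for s in shards for hit in shard_topk(s, query, k)]
    return [v for _, v in heapq.nsmallest(k, candidates)]

results = global_topk(0.5, K)
print(len(results))  # K results, merged from up to N_SHARDS * K candidates
```

Because each shard returns its own full top-k, the merged list is guaranteed to contain the true global top-k across all shards.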

mediaman|5 months ago

And a topk of 30 also means reranking of any sort is out, except for maybe limited reranking of 30->10, but that seems kind of pointless with today’s LLMs that can handle a bit more context.
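The 30-to-10 rerank mentioned above amounts to rescoring a small capped candidate set with a more expensive function. A minimal sketch, where the scoring lambda is a hypothetical stand-in for a cross-encoder or other reranker:

```python
def rerank(candidates, score, final_k=10):
    # Rescore the first-stage candidates and keep the best final_k.
    # With a hard TopK=30 cap, this can only reshuffle those 30 items.
    return sorted(candidates, key=score, reverse=True)[:final_k]

first_stage = list(range(30))  # capped candidate set from the vector store
top10 = rerank(first_stage, score=lambda d: -abs(d - 7))
print(top10[0])
```

The limitation mediaman points out is visible here: any document the first stage dropped at position 31+ is unrecoverable, no matter how good the reranker is.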

catlifeonmars|5 months ago

3) ask TAM for a service quota increase

iknownothow|5 months ago

S3 has much bigger fish in its sights than the measly vector DB space. If you look at the subtle improvements to S3's features in recent years, it is clear as day, at least to me, that they're going after the whale that is Databricks. And they're doing it the best way possible: slowly and silently eating away at their moat.

AWS Athena hasn't received as much love for some reason. In the next two years I expect major updates and/or improvements. They should kill off Redshift.

antonvs|5 months ago

> … going after the whale that is Databricks.

Databricks is tiny compared to AWS, maybe 1/50th the revenue. But they’re both chasing a big and fast-growing market. I don’t think it’s so much that AWS is going after Databricks as that Databricks happens to be in a market that AWS is interested in.

softwaredoug|5 months ago

I’m not sure S3 vectors is a true vector database/search engine in the way something like Elasticsearch, Turbopuffer or Milvus is. It’s more a convenient building block for simple high scale retrieval.

I think of a search system doing quite a lot from sparse/lexical/hybrid search, metadata filtering, numerical ranking (recency/popularity/etc), geo, fuzzy, and whatever other indices at its core. These are building blocks for getting initial candidates.

Then you need to be able to combine all these into one result set for your users - usually with a query DSL where you can express a ranking function. Then there’s usually ancillary features that come up (highlighting, aggregations, etc).

So while S3 vectors is a fascinating primitive, I’m not sure I’d reach for it outside specific circumstances.
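One common way full search engines combine those separate candidate sets (lexical, vector, etc.) into one result list is reciprocal rank fusion. A minimal sketch of the general technique, not something S3 Vectors provides:

```python
def rrf(rankings, k=60):
    # Reciprocal Rank Fusion: each list contributes 1 / (k + rank + 1)
    # for every document it ranks; k=60 is the conventional constant.
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["doc_a", "doc_b", "doc_c"]   # e.g. BM25 candidates
vector  = ["doc_c", "doc_a", "doc_d"]   # e.g. ANN candidates
print(rrf([lexical, vector]))
```

RRF only needs ranks, not comparable scores, which is why it is a popular glue layer between otherwise incompatible retrieval building blocks.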

storus|5 months ago

Does this support hybrid search (dense + sparse embeddings)? Pure dense embeddings aren't that great for specific searches; they only capture meaning reliably. Amazon's own embeddings also aren't SOTA.

danielcampos93|5 months ago

I think you would be very surprised by the number of customers who don't care if the embeddings are SOTA. For every Joe who wants to talk GraphRAG + MTEB + CMTEB and adaptive rag there are 50 who just want whatever IT/prodsec has approved

infecto|5 months ago

That’s where my mind was rolling and also if not, can this be used in OpenSearch hybrid search?

hbcondo714|5 months ago

It would be great to have the vector database run on the edge / on-device for offline-first and be privacy-focused. https://objectbox.io/ does this but i would like to see AWS and others offer this as well.

greenavocado|5 months ago

I am already using Qdrant very heavily for code dev (RAG) and I don't see that changing any time soon, because it's the primary choice for the tools I use and it works well.

rubenvanwyk|5 months ago

I don’t think it’s either-or, this will probably become the default / go-to - if you aren’t storing your vectors in your db like Neon or Turso.

As far as I understand, Milvus is appropriate for very large scale, so will probably continue targeting enterprise.

j45|5 months ago

The cloud is someone else's computer.

If it's this sensitive, there's a lot of companies staying on the sidelines until they can compute in person, or limiting what and how they use it.

anonu|5 months ago

If you like to die in a slow and expensive way - sure.

resters|5 months ago

By hosting the vectors themselves, AWS can meta-optimize its cloud based on content characteristics. It may seem like not a major optimization, but at AWS scale it is billions of dollars per year. It also makes it easier for AWS to comply with censorship requirements.

coredog64|5 months ago

This comment appears to misunderstand the control plane/data plane distinction of AWS. AWS does have limited access to your control plane, primarily for things like enabling your TAMs to analyze your costs or getting assistance from enterprise support teams. They absolutely do not have access to your dataplane unless you specifically grant it. The primary use case for the latter is allowing writes into your storage for things like ALB access logs to S3. If you were deep in a debug session with enterprise support they might request one-off access to something large in S3, but I would be surprised if that were to happen.

barbazoo|5 months ago

> It also makes it easier for AWS to comply with censorship requirements.

Does it, how? Why would it be the vector store that would make it easier for them to censor the content? Why not censor the documents in S3 directly, or the entries in the relational database. What is different about censoring those vs a vector store?

j45|5 months ago

Also, if it's not encrypted, I'm not sure if AWS or others "synthesize" customer data by a cursory scrubbing of so called client identifying information, and then try to optimize and model for those scenarios at scale.

I do feel more and more some information in the corpus of AI models was done this way. A client's name and private identifiable information might not be in the model, but some patterns of how to do things sure seem to come up from such sources.

vincirufus|5 months ago

This could be game changing

giveita|5 months ago

Betteridge can answer No to two questions at once!

Fendy|5 months ago

what do you think?

sharemywin|5 months ago

It's annoying to me that there's not a doc store with vectors. Seems like the vector DBs just store the vectors, I think.