jeadie | 5 months ago | on: Vector database that can index 1B vectors in 48M
jeadie's comments
jeadie | 9 months ago | on: Airport for DuckDB
jeadie | 10 months ago | on: Show HN: TextQuery – Query CSV, JSON, XLSX Files with SQL
jeadie | 1 year ago | on: Pinecone integrates AI inferencing with vector database
Timescale most recently added it, but yes, a bunch of others do too: Weaviate, Spice AI, Marqo, etc.
jeadie | 1 year ago | on: Pg_parquet: An extension to connect Postgres and parquet
jeadie | 1 year ago | on: Pg_lakehouse: Query Any Data Lake from Postgres
jeadie | 1 year ago | on: Ask HN: Who is hiring? (April 2024)
Spice AI provides building blocks for data and AI-driven applications by composing real-time and historical time-series data, high-performance SQL query, and machine learning training and inferencing in a single, interconnected AI backend-as-a-service.
We just launched github.com/spiceai/spiceai, a unified SQL query interface and portable runtime to locally materialize, accelerate, and query data tables sourced from any database, data warehouse, or data lake.
We're hiring experienced software engineers, ideally with Rust and/or Golang production experience. We're focused on large data and distributed systems, so experience in these is important too. More details: https://spice.ai/careers#section-open-positions
jeadie | 1 year ago | on: Show HN: Spice.ai – materialize, accelerate, and query SQL data from any source
jeadie | 1 year ago | on: Show HN: Spice.ai – materialize, accelerate, and query SQL data from any source
jeadie | 2 years ago | on: Show HN: Yes, another vector embeddings API
jeadie | 2 years ago | on: GGML – AI at the Edge
jeadie | 2 years ago | on: Weaviate – Open-Source AI Native Vector Database
jeadie | 2 years ago | on: PrivateGPT
jeadie | 2 years ago | on: Ask HN: Seeking a Vector Database for ClickHouse Users – Suggestions Appreciated
jeadie | 2 years ago | on: Ask HN: Seeking a Vector Database for ClickHouse Users – Suggestions Appreciated
jeadie | 2 years ago | on: After All Is Said and Indexed – Unlocking Information in Recorded Speech
Chroma, Pinecone, and I guess FAISS/HNSWlib/etc. only handle vector operations. What I'd really want, and what Marqo does, is something that handles everything end to end.
jeadie | 2 years ago | on: After All Is Said and Indexed – Unlocking Information in Recorded Speech
jeadie | 2 years ago | on: After All Is Said and Indexed – Unlocking Information in Recorded Speech
jeadie | 2 years ago | on: After All Is Said and Indexed – Unlocking Information in Recorded Speech
jeadie | 2 years ago | on: Do you need a vector database?
1. Chunking long text fields in documents to get a better semantic vector for each chunk (you can also only fit so much into an LLM).
2. Separately from 1., chunking long text fields (or even images, audio, etc.) is one way to perform highlighting. It helps answer the question: for a given document, what about it caused it to be returned? You can then point to the area in the image/text/audio that was most relevant.
3. You may want to run different LLMs on different fields (perhaps a separate multi-modal LLM vs. a standard text LLM) or, as another comment said, have different transforms/representations of the same field.
Perhaps 100 vectors is non-standard, but definitely not unseen.
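To make points 1 and 2 concrete, here is a rough sketch of one document turning into several vectors, plus chunk-level highlighting. Everything here is made up for illustration: `chunk_text`, `toy_embed`, `index_document` and `highlight` are hypothetical names, and `toy_embed` is just a bag-of-words count standing in for a real embedding model so the snippet runs without dependencies.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=8, overlap=2):
    """Split text into overlapping word windows, returning (start_word, chunk) pairs."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append((start, " ".join(words[start:start + max_words])))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


def toy_embed(text):
    """ASSUMPTION: stand-in for a real embedding model (sparse word counts)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def index_document(text, **kw):
    """One document -> many vectors, one per chunk."""
    return [(start, chunk, toy_embed(chunk)) for start, chunk in chunk_text(text, **kw)]


def highlight(query, indexed):
    """Return (score, start_word, chunk) for the chunk closest to the query."""
    q = toy_embed(query)
    return max((cosine(q, vec), start, chunk) for start, chunk, vec in indexed)


doc = (
    "The invoice was sent on Monday. Payment is due within thirty days. "
    "Late payment incurs a five percent fee. Contact billing for questions "
    "about refunds."
)
idx = index_document(doc, max_words=8, overlap=2)
score, start, chunk = highlight("late fee percent", idx)
print(len(idx), start, chunk)
```

The returned `start` offset is what lets you point back at the region of the document that caused the match. With a real model you would swap `toy_embed` for the model call and store the per-chunk vectors in whatever index you use.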
Open source at https://github.com/spiceai/spiceai