ethanahte's comments

ethanahte | 2 years ago | on: Do you need a vector database?

Yeah, depending on the model, calculating the 10 million embeddings could take longer sequentially, but, as you mention, it's also an embarrassingly parallel operation. I don't think that indexing can be performed in parallel, but I may be wrong on that one.
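To make the "embarrassingly parallel" part concrete, here is a minimal sketch using only the standard library. `embed_batch` is a hypothetical stand-in for a real model call (it just hashes the text so the example runs anywhere), and the batch size and worker count are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib


def embed_batch(texts):
    """Hypothetical stand-in for a real embedding model call.

    Maps each text to a small deterministic vector so the sketch runs
    without any model installed.
    """
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([b / 255.0 for b in digest[:8]])
    return vectors


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_parallel(texts, batch_size=4, workers=4):
    """Embed batches independently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(embed_batch, batched(texts, batch_size))
        return [vec for batch in results for vec in batch]
```

Each batch is independent of the others, which is what makes the work easy to spread out. Threads are a reasonable fit if the embedding call is a network request; for a CPU-bound local model you would probably swap in processes or multiple GPUs instead.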

ethanahte | 3 years ago | on: Do you need a vector database?

Hi, author here. I totally agree with you that, for large scale, you're going to need a vector database. My hope is more to help people avoid scenarios like the one in this comment: https://news.ycombinator.com/item?id=35552303

Tangentially, I really like the approach that Haystack has taken, where they allow you to slot in whichever document store you want, and that document store can scale from in-memory, to SQLite, to Postgres, to Pinecone: https://docs.haystack.deepset.ai/docs/document_store

In terms of the one-time cost of indexing, you're totally right! Although, one thing to call out is that you will have to re-index every time you change your embedding model, such as for fine-tuning. I don't have a good handle on how prevalent this is, though.
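One way to keep track of this is to store the model identifier alongside the index and rebuild whenever it changes. This is only a sketch: `TaggedIndex`, `ensure_index`, and the `embed` callable are hypothetical names for illustration, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedIndex:
    """Toy in-memory index that remembers which model produced its vectors."""
    model_name: str
    vectors: dict = field(default_factory=dict)


def ensure_index(index, model_name, documents, embed):
    """Return an index valid for `model_name`, rebuilding if the model changed.

    `documents` maps document id -> text; `embed` maps text -> vector.
    """
    if index is not None and index.model_name == model_name:
        return index  # vectors were produced by the same model, so reuse them
    fresh = TaggedIndex(model_name=model_name)
    for doc_id, text in documents.items():
        fresh.vectors[doc_id] = embed(text)  # full re-embed and re-index
    return fresh
```

Vectors from two different models do not live in the same space, so after fine-tuning the whole collection goes through the rebuild branch again.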

ethanahte | 3 years ago | on: Do you need a vector database?

Hi, author here.

1. You make a great point about longer documents requiring multiple vectors which I should've mentioned in the post. Depending on your use case, this can certainly explode your dataset size!

2. Good to know about the pgvector limitations -- I haven't used it yet.

3. I guess "index" would be the more database-y term. That said, one thing I'll call out is that you have to re-index if you ever change your embedding model, and indexing can be slow. It took me ~20-30 minutes to index the 10 million embeddings in my benchmark.
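To illustrate the first point, here is a rough sketch of how one long document turns into several vectors. It assumes simple word-based windows; the `chunk_size` and `overlap` values are made up for illustration, and a real pipeline would more likely split on model tokens.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into overlapping word windows; each window gets one vector."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the document
    return chunks


def count_vectors(documents, chunk_size=200, overlap=50):
    """Total number of vectors needed once every document is chunked."""
    return sum(len(chunk_words(doc, chunk_size, overlap)) for doc in documents)
```

With these settings a 1,000-word document becomes 7 chunks, so the number of vectors to store and search is several times the number of documents.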

ethanahte | 9 years ago | on: Ask HN: Who is hiring? (March 2017)

Dia&Co | Software Engineer, Product Manager, Data Scientist, and Data Analyst | New York, NY | Full-time, ONSITE, REMOTE

Dia&Co is the premier personal styling service for plus-size women. We’re looking for engineers, product, and data people to help create our suite of large consumer-facing and internal products that are transforming both operational efficiency and consumer e-commerce. We work with Ruby on Rails on the engineering side and Python on the data science side.

Please check out our tech blog to get an idea of what we think about and value: https://making.dia.com/

The interview process is a phone screen, a take-home coding challenge, and finally an on-site interview. Apply here, and let us know that you found us on Hacker News: https://www.dia.co/careers

ethanahte | 9 years ago | on: Ask HN: Who is hiring? (February 2017)

Dia&Co | New York City or REMOTE | Software Engineer, Product Manager, Data Scientist, and Data Analyst | Full-time

Dia&Co is the premier personal styling service for plus-size women. We’re looking for engineers, product, and data people to help create our suite of large consumer-facing and internal products that are transforming both operational efficiency and consumer e-commerce. We work with Ruby on Rails on the engineering side and Python on the data science side.

Please check out our tech blog to get an idea of what we think about and value: https://making.dia.com/

The interview process is a phone screen, a take-home coding challenge, and finally an on-site interview. Apply here, and let us know that you found us on Hacker News: https://www.dia.co/careers

ethanahte | 9 years ago | on: Ask HN: Who is hiring? (December 2016)

Dia&Co | New York City or REMOTE | Software Engineer, Product Manager, and Data Scientist | Full-time

Dia&Co is the premier personal styling service for plus-size women.

We’re looking for software engineers, product managers, and data scientists to help create our suite of large consumer-facing and internal products that are transforming both operational efficiency and consumer e-commerce.

We work with Ruby on Rails on the engineering side and Python on the data science side.

The interview process is a phone screen, a take-home coding challenge, and finally an on-site interview.

Apply here, and let us know that you found us on Hacker News: https://www.dia.co/careers

ethanahte | 9 years ago | on: Ask HN: Who is hiring? (November 2016)

Dia&Co | New York City or REMOTE | Software Engineer, Product Manager, and Data Scientist | Full-time

Dia&Co is the premier personal styling service for plus-size women.

We’re looking for software engineers, product managers, and data scientists to help create our suite of large consumer-facing and internal products that are transforming both operational efficiency and consumer e-commerce.

We work with Ruby on Rails on the engineering side and Python on the data science side.

The interview process is a phone screen, a take-home coding challenge, and finally an on-site interview.

Apply here, and let us know that you found us on Hacker News: https://www.dia.co/careers
