egorr | 2 years ago

heya, supabase engineer here. i coauthored the part about benchmarking different embedding models with pgvector, comparing QPS when using models with fewer dimensions.

btw, you can find the dataset with embeddings generated by all 3 models mentioned (text-embedding-ada-002, all-MiniLM-L6-v2, and GTE-small) on huggingface[0]

and big thanks to Stephan Sturges for his dataset[1]. we just extended his texts and OpenAI embeddings with embeddings from the oss models

[0] https://huggingface.co/datasets/Supabase/wikipedia-en-embedd...

[1] https://www.kaggle.com/datasets/stephanst/wikipedia-simple-o...
