egorr | 2 years ago
btw, you can find the dataset with embeddings generated by all three models mentioned (text-embedding-ada-002, all-MiniLM-L6-v2, and GTE-small) on Hugging Face [0].
And big thanks to Stephan Sturges for his dataset [1]. We just extended it, which already had the texts and the OpenAI embeddings, with embeddings from the open-source models.
[0] https://huggingface.co/datasets/Supabase/wikipedia-en-embedd...
[1] https://www.kaggle.com/datasets/stephanst/wikipedia-simple-o...