top | item 42015282

(no title)

tomthe | 1 year ago

Nice introduction, but I think that ranking the models purely by their input token limits is not a useful exercise. Looking at the MTEB leaderboard is better (although a lot of the models are probably overfitting to their test set).

This is a good time to shill for my visualization of 5 million embeddings of HN posts, users and comments: https://tomthe.github.io/hackmap/


kaycebasques | 1 year ago

Thanks. A couple of other people gave me the same feedback in another comment thread, and it definitely makes sense not to overindex on input token size. Will update that section in a bit.