Hi, I'm one of the creators of infinity, and the article touches on sparse vectors vs. BM25. Sparse vectors perform well on some evaluations, but they are produced by a trained model, so they can't fully represent all of a user's keywords/tokens: those that don't appear in the training set are truncated. That has a very big impact in many enterprise vertical scenarios. BM25 doesn't have this limitation.
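
To make the point concrete, here is a minimal toy sketch in Python. It is not infinity's code and not any real sparse model: the corpus, the part number "xj-9000", the `MODEL_VOCAB` set, and the helper names are all made up for illustration, and dropping out-of-vocabulary tokens is a deliberate simplification of the truncation described above. BM25 is scored with the common Lucene-style IDF.

```python
import math
from collections import Counter

# Toy corpus; "xj-9000" stands in for a domain-specific keyword (hypothetical).
DOCS = [
    "replace the xj-9000 valve before restarting the pump",
    "restart the pump after replacing the valve",
    "the pump manual covers routine valve maintenance",
]

# Hypothetical fixed vocabulary of a trained sparse encoder; it never saw "xj-9000".
MODEL_VOCAB = {"replace", "replacing", "valve", "pump", "restart",
               "restarting", "manual", "maintenance"}

def tokenize(text):
    # Plain whitespace tokenization; any string can become a term.
    return text.lower().split()

def fixed_vocab_tokens(text):
    # Simplifying assumption: tokens outside the model vocabulary are dropped.
    return [t for t in tokenize(text) if t in MODEL_VOCAB]

def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    # Standard BM25 with Lucene-style IDF, computed over the indexed corpus itself.
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    df = Counter(t for d in docs_tokens for t in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores

query = "xj-9000"
bm25 = bm25_scores(tokenize(query), [tokenize(d) for d in DOCS])
sparse_query = fixed_vocab_tokens(query)

print("BM25 scores:", bm25)                      # only the first doc scores above zero
print("Fixed-vocab query terms:", sparse_query)  # empty: the keyword was dropped
```

In this toy setup BM25 can match the unseen keyword because its term dictionary is built from the indexed documents, while the fixed-vocabulary query ends up with no terms to match on.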
philippemnoel|1 year ago