top | item 36050277

Vasyl_R | 2 years ago

Hello Paul, thanks a lot for your feedback. I really appreciate it, and we'll take it into consideration for our next steps. We saw people struggling with vector search that retrieves only half of the relevant paragraph, just because the document was chunked based on the number of tokens. So our first step is to let users (not just people who know Python, NLTK, and LangChain) pre-process their documents before embedding with a few clicks: adding images, and cleansing the text by removing at least punctuation and stop-words. But you're totally right - now we have to think not about pre-processing a single document but about embedding a large set of documents.
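The cleansing step described above (stripping punctuation and stop-words before embedding) can be sketched in plain Python. This is a minimal illustration, not the product's actual pipeline; the tiny hardcoded stop-word list stands in for a fuller corpus such as NLTK's `stopwords`:

```python
import string

# Small illustrative stop-word list; a real pipeline would use a fuller
# set (e.g. NLTK's stopwords corpus).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "of", "to", "in"}

def clean_for_embedding(text: str) -> str:
    """Strip punctuation and stop-words from text before chunking/embedding."""
    # Remove all punctuation characters.
    stripped = text.translate(str.maketrans("", "", string.punctuation))
    # Drop stop-words (case-insensitively), keeping original word order.
    tokens = [t for t in stripped.split() if t.lower() not in STOP_WORDS]
    return " ".join(tokens)

print(clean_for_embedding("The chunk is split, based on the number of tokens."))
# → chunk split based on number tokens
```

Cleansing like this shrinks the token count per chunk, which is exactly why chunk boundaries drawn by raw token quantity can land mid-paragraph on unprocessed text.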

Really appreciate your time and hope to have your star or see you among our watchers.
