top | item 34037499

(no title)

I should add, Riley used the ada embedding model (rather than sentence transformers). Performance wise they should be similar (in ability to encode meaning accurately) but the ada model can encode a much larger chunk of text. I don't know exact numbers but something like 1-2 pages of text in a typical corporate PDF. Whereas sentence transformers are typically limited to around a paragraph of text.

Typically you'd split the text in paragraph sized chunks to handle this requirement of sentence transformers, with GPT-3 embeddings you naturally have more flexibility there

discuss

No comments yet.