neodypsis | 1 year ago

How does it compare to Jina V3 [0], which also has 8192 context length?

0. https://arxiv.org/abs/2409.10173

bclavie | 1 year ago

They perform different roles, so they're not directly comparable.

Jina V3 is an embedding model: it takes a base model and fine-tunes it further, specifically for embedding-ish tasks (retrieval, similarity...). This is what we call a "downstream" model/application.

ModernBERT is a base model & architecture. It's not supposed to be used out of the box, but fine-tuned for other use-cases, serving as their backbone. In theory (and, given early signal, most likely in practice too), it'll make for really good downstream embeddings once people build on top of it!
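
To make the split a bit more concrete, here's a rough sketch of the part an embedding model adds on top of a backbone: a base encoder gives you one vector per token, and a downstream embedding model has to turn those into a single vector per text so texts can be compared. Mean pooling plus cosine similarity is one common way to do that. Everything here is an assumption for illustration: the token vectors are made-up toy numbers standing in for encoder output, `mean_pool` and `cosine_similarity` are hypothetical helper names, and none of this is meant to describe how Jina V3 or any ModernBERT-based model is actually built or trained.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy per-token outputs standing in for a base encoder (made-up values).
# The last row of doc_a is padding and is masked out.
doc_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask_a = [1, 1, 0]
doc_b = [[2.0, 0.0], [0.0, 2.0]]
mask_b = [1, 1]

emb_a = mean_pool(doc_a, mask_a)
emb_b = mean_pool(doc_b, mask_b)
similarity = cosine_similarity(emb_a, emb_b)
```

The pooling step itself is trivial. The fine-tuning that makes the pooled vectors useful for retrieval or similarity is the real downstream work, and that's the bit a base model leaves to whoever builds on it.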