top | item 42466286

(no title)

cubie | 1 year ago

Not yet - these are base models, or "foundational models". They're great for molding into different use cases via finetuning, better than common models like BERT, RoBERTa, etc. in fact, but like those models, these ModernBERT checkpoints can only do one thing: mask filling.

For other tasks, such as retrieval, we still need people to finetune them for it. The ModernBERT documentation has some scripts for finetuning with Sentence Transformers and PyLate for retrieval: https://huggingface.co/docs/transformers/main/en/model_doc/m... But people still need to make and release these models. I have high hopes for them.

discuss

order

No comments yet.