At my work, we still prefer DistilBERT for text classification. It almost always does well with a little fine-tuning. In very rare cases, we use an LLM/agentic setup when the task involves referring to both images and text at the same time.
I can confirm that DistilBERT has worked well when I have used it for classification, especially on shortish sequences. I'm really interested in trying out ModernBERT, or a smaller variant, because of its larger context window (8192 tokens).
I was thinking of trying ModernBERT for one of my projects, but I can only draw conclusions after seeing how it performs on my use case. Do you think ModernBERT will be capable of expanding abbreviated sentences?
coder68|6 months ago
siddheshgunjal|6 months ago