otabdeveloper4|17 hours ago
The entire point of LLMs is that you don't have to spend money training them for each specific case. You can train something like Qwen once and then use it to solve whatever classification, summarization, or translation problem you have in minutes instead of weeks.
mootothemax|13 hours ago
BERT isn’t an SLM, and the original was released in 2018.
The whole new era kicked off with Attention Is All You Need; we haven’t reached even a single decade of work on it.
otabdeveloper4|13 hours ago
Huh? BERT is literally a language model that's small and uses attention.
And we had good language models before BERT too.
They were a royal bitch to train properly, though. Nowadays you can get the same with just 30 minutes of prompt engineering.
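To make the "30 minutes of prompt engineering" point concrete, here is a minimal sketch of turning a general instruction-tuned model into a sentiment classifier with nothing but a prompt template and a reply parser. `call_llm` is a hypothetical stand-in for whatever local inference endpoint you run (e.g. one serving Qwen); the label set and prompt wording are illustrative assumptions, not anyone's production setup.

```python
# Zero-shot classification via prompting: no training, just a template
# that constrains a general LLM to a fixed label set, plus a parser
# that maps the free-form reply back onto those labels.

LABELS = ["positive", "negative", "neutral"]

def build_prompt(text: str) -> str:
    """Build a zero-shot classification prompt for a general LLM."""
    return (
        "Classify the sentiment of the following text.\n"
        f"Answer with exactly one word from: {', '.join(LABELS)}.\n\n"
        f"Text: {text}\nAnswer:"
    )

def parse_label(raw: str) -> str:
    """Map a model reply onto the label set, with a safe fallback."""
    reply = raw.strip().lower()
    for label in LABELS:
        if label in reply:
            return label
    return "neutral"  # fallback when the model goes off-script

def classify(text: str, call_llm) -> str:
    """call_llm is a hypothetical prompt -> reply function (your local endpoint)."""
    return parse_label(call_llm(build_prompt(text)))
```

Swapping in a different task (topic tagging, language ID, summarization) is just a different template and label set against the same pretrained model, which is the point being argued above.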
znnajdla|16 hours ago
I don’t agree. I would say the entire point of LLMs is to solve a certain class of non-deterministic problems that cannot be solved with deterministic procedural code. LLMs don’t need to be generally useful in order to be useful for specific business use cases. As a programmer, I would be very happy to have a local coding agent like Claude Code that can do nothing but write code in my chosen programming language or framework, instead of a general model like Opus: hyper-specialized and optimized for that one task, and therefore small enough to run on my MacBook. I don’t need the other general reasoning capabilities of Opus.
swiftcoder|16 hours ago
You are confusing LLMs with more general machine learning here. We've been solving those non-deterministic problems with machine learning for decades (for example, tasks like image recognition). LLMs are specifically about scaling that up and generalising it to solve any problem.