otabdeveloper4 | 13 hours ago
Huh? BERT is literally a language model that's small and uses attention.
And we had good language models before BERT too.
They were a royal bitch to train properly, though. Nowadays you can get the same with just 30 minutes of prompt engineering.
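Since the comment above rests on BERT being "a language model that uses attention," here is a minimal pure-Python sketch of scaled dot-product attention, the core operation stacked inside BERT's transformer encoder. It is single-head, with no learned projection matrices, so it is an illustration of the mechanism rather than BERT's actual implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention.

    Each argument is a list of equal-length vectors (lists of floats).
    For every query, compute dot-product scores against all keys,
    scale by sqrt(dim), softmax into weights, and return the
    weighted mix of the value vectors.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

A query aligned with the first key pulls the output toward the first value vector; BERT applies this (with learned query/key/value projections and multiple heads) at every encoder layer.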
mootothemax | 12 hours ago
Astute readers will note what’s been missed here.
Fascinating, really. Your confidently stated yet factually void comments I’d have previously put down to one of the classic programmer mindsets. Nowadays though - where do I see that kind of thing most often? Curious.
ricericerice | 12 hours ago
The irony that your own comment was confidently stated yet void of any content was not missed either - consider dropping the superiority complex next time.
krisoft | 11 hours ago
I’m not astute enough to see what was missed here. Could you explain?