Fergusonb | 1 year ago
It still seems to me that these models are 'dumb' and often don't understand what I'm asking, whereas Claude's intuition is much stronger.
To me, r1 14b even feels weaker than qwen 2.5 14b.
Primary use-case is web technology / coding. Maybe I'm prompting it incorrectly?
Workaccount2|1 year ago
o1 or even o3 might be able to crack academic-level math problems, but I still wouldn't trust either to correctly fill out a McDonald's application using a PDF of my resume and a calendar of my availability.
pclmulqdq|1 year ago
Havoc|1 year ago
It’s a bit like how you get instruct-tuned models and you get chat-tuned ones. Neither is really worse than the other; they're just aimed at different uses.
cpldcpu|1 year ago
Vibes are important in this case...