(no title)
Metacelsus | 17 days ago
Google has definitely been pulling ahead in AI over the last few months. I've been using Gemini and finding it's better than the other models (especially for biology where it doesn't refuse to answer harmless questions).
CuriouslyC|17 days ago
throwup238|17 days ago
neilellis|17 days ago
NitpickLawyer|17 days ago
IMO it's the other way around. Benchmarks only measure applied horse power on a set plane, with no friction and your elephant is a point sphere. Goog's models have always punched over what benchmarks said, in real world use @ high context. They don't focus on "agentic this" or "specialised that", but the raw models, with good guidance are workhorses. I don't know any other models where you can throw lots of docs at it and get proper context following and data extraction from wherever it's at to where you'd need it.
scarmig|17 days ago
Usually, when you decrease false positive rates, you increase false negative rates.
Maybe this doesn't matter for models at their current capabilities, but if you believe that AGI is imminent, a bit of conservatism seems responsible.
nkzd|17 days ago
Davidzheng|17 days ago
verdverm|17 days ago
simianwords|17 days ago