Lapel2742 | 2 months ago
I don't know about the capabilities of a 5090, but you can probably run a Devstral-2 [1] model locally on a Mac with good performance. Even the small Devstral-2 model (24b) seems to easily beat Sonnet 3.5 [2]. My impression is that local models have made huge progress.
Coding aside, I'm also impressed by the Ministral models (3b, 8b and 14b) Mistral AI released a couple of weeks ago. The Granite 4.0 models from IBM also seem capable in this context.
cmrdporcupine | 2 months ago
It's neat to play with, but not practical.
The only story I can see that makes sense for running at home is fine-tuning: take an open-weight model, <hand waving> do things to it, and run the result. Even then, I believe there are places (Hugging Face?) that will host and run your updated model for less than it would cost to run it yourself.
Aurornis | 2 months ago
I've played with Devstral 2 a lot since it came out. I've seen the benchmarks. I just don't believe it's actually better for coding.
It's amazing that it can do some light coding locally. I think it's great that we have that. But if I had to choose between a 2024-era model and Devstral 2, I'd pick the older Sonnet or GPTs any day.