
Myrmornis | 2 months ago

Can anyone give any tips for getting something that runs fairly fast under ollama? It doesn't have to be very intelligent.

When I tried gpt-oss and qwen using ollama on an M2 Mac, the main problem was that they were extremely slow. But I do need a free local model.


parthsareen | 2 months ago

How much RAM are you running with? Qwen3 and gpt-oss:20b punch a good bit above their weight. Personally, I use them for small agents.
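If it helps, a minimal sketch of trying a small model under ollama (the `qwen3:4b` tag is an assumption; pick whatever small quantized tag the registry currently offers, since smaller models are far more likely to fit in unified memory and run fast on an M2):

```shell
# Pull a small model first; qwen3:4b is an assumed tag,
# substitute any small model from the ollama registry.
ollama pull qwen3:4b

# Run a one-off prompt.
ollama run qwen3:4b "Summarize this in one sentence: local LLMs trade quality for speed."

# Check what's loaded and how much memory it's using;
# if the model spills past RAM, speed falls off a cliff.
ollama ps
```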

am17an | 2 months ago

Use llama.cpp? I get 250 tok/s on gpt-oss using a 4090; not sure about Mac speeds.
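For reference, a hedged sketch of running llama.cpp directly (the GGUF path is a placeholder; on Apple Silicon the Metal backend is used by default, so no extra flags are needed for GPU offload):

```shell
# One-off generation with llama.cpp's CLI:
# -m = model file (placeholder path), -p = prompt, -n = max tokens.
llama-cli -m ./gpt-oss-20b.gguf -p "Hello" -n 64

# Or serve an OpenAI-compatible local API instead:
llama-server -m ./gpt-oss-20b.gguf --port 8080
```

Skipping ollama's wrapper also makes it easier to compare quantizations directly, since you choose the exact GGUF file.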