nyrikki | 11 days ago
Same with unsloth/gpt-oss-120b-GGUF:F16: it gets 25 tps, and gpt-oss-20b gets 195 tps!
IMHO the advantage is that you can boot from the APU and pass the GPU through to a VM, so you get nice, safer VMs for agents while still running on DDR4.
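To put those throughput numbers in wall-clock terms, here is a rough sketch. The 1,000-token response length is a made-up figure for illustration, the function name is hypothetical, and prompt-processing time is ignored, so treat the results as a lower bound rather than a benchmark.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Estimate decode time only: tokens to generate divided by throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical 1,000-token answer at the two quoted speeds.
RESPONSE_TOKENS = 1000  # assumption, not a figure from the thread

for name, tps in [("gpt-oss-120b", 25.0), ("gpt-oss-20b", 195.0)]:
    seconds = generation_seconds(RESPONSE_TOKENS, tps)
    print(f"{name}: about {seconds:.1f} s for {RESPONSE_TOKENS} tokens")
```

At 25 tps a long answer is something you wait on; at 195 tps it is close to interactive.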
lambda | 11 days ago
nyrikki | 11 days ago
I won’t use a public model for my secret sauce; there's no reason to help the foundation models with it.
Even an old 1080 Ti works well for fill-in-the-middle (FIM) completion in IDEs.
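For anyone who hasn't seen FIM, this is roughly what the IDE plugin sends to the model: the text before the cursor, the text after it, and a marker asking for the middle. A minimal sketch follows. The sentinel strings are an assumption (they match the style used by some code models, but each model defines its own, so check the model card), and the function names are hypothetical.

```python
# Sentinel tokens are model-specific; these values are an assumption.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def split_at_cursor(text: str, cursor: int) -> tuple[str, str]:
    """Split a buffer into (before cursor, after cursor)."""
    if not 0 <= cursor <= len(text):
        raise ValueError("cursor is outside the buffer")
    return text[:cursor], text[cursor:]


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Prefix-suffix-middle layout: the model generates what follows FIM_MIDDLE."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


buffer = "def add(a, b):\n    return \n"
cursor = buffer.index("return ") + len("return ")
before, after = split_at_cursor(buffer, cursor)
prompt = build_fim_prompt(before, after)
print(prompt)
```

The completion the model returns gets spliced in at the cursor. Prompts like this are short, which is part of why an older card can keep up.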
IMHO the above setup works well for boilerplate, and even the SOTA models fail on the domain-specific portions.
While I lucked out and foresaw the huge price increases, you can still find some good deals. Old gaming computers work pretty well, especially if you have Claude Code locally churn on the boring parts while you work on the hard parts.