kpw94 | 2 months ago
That's a good idea!
Curious about this, if you don't mind sharing:
- what's the stack? (Do you run something like llama.cpp on that rented machine?)
- what model(s) do you run there?
- what's your rough monthly cost? (Does it come out much cheaper than if you called the equivalent paid APIs?)
clusterhacks|2 months ago
I am usually just running gpt-oss-120b or one of the qwen models. Sometimes gemma? These are mostly "medium" sized in terms of memory requirements - I'm usually trying unquantized models that will easily run on a single 80-ish GB GPU because those are cheap.
I tend to spend $10-$20 a week. But I am almost always prototyping or testing an idea for a specific project that doesn't require me to run 8 hrs/day. I don't use the paid APIs for several reasons but cost-effectiveness is not one of those reasons.
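For anyone who wants the monthly figure asked about upthread, a back-of-the-envelope sketch of the arithmetic follows. The only numbers taken from the comment are the $10-$20 weekly spend. The hourly rental rate is a made-up placeholder (not a quoted price from any provider), and the function names are hypothetical.

```python
# Rough conversion of the weekly spend above into monthly cost and GPU hours.
# ASSUMPTION: the hourly rate below is a hypothetical placeholder, not a real quote.

WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month


def monthly_cost(weekly_usd: float) -> float:
    """Scale a weekly spend to an average month."""
    return weekly_usd * WEEKS_PER_MONTH


def gpu_hours_per_week(weekly_usd: float, hourly_rate_usd: float) -> float:
    """Hours of rental time a weekly budget buys at a given hourly rate."""
    return weekly_usd / hourly_rate_usd


ASSUMED_HOURLY_RATE_USD = 2.00  # hypothetical rate for a single ~80 GB GPU

for weekly in (10, 20):  # the $10-$20 a week range from the comment
    print(
        f"${weekly}/week is about ${monthly_cost(weekly):.2f}/month, "
        f"or {gpu_hours_per_week(weekly, ASSUMED_HOURLY_RATE_USD):.1f} GPU-hours/week "
        f"at an assumed ${ASSUMED_HOURLY_RATE_USD:.2f}/hr"
    )
```

Swap in whatever your provider actually charges per hour. The monthly range depends only on the weekly spend, so it holds regardless of the assumed rate.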
Juminuvi|2 months ago
bigiain|2 months ago