abra0 | 2 years ago
Agreed on reliability and data transfer, that's a good point.
Out of curiosity, what do you use a 2x3090 rig for? Bulk, non-time-sensitive inference on down-quantized models?
imiric | 2 years ago
If you're using them for inference, your usage pattern is unpredictable. I might go hours between uses, or only minutes. If you shut the instance down and release it, the host might be gone the next time you want to use it.
> what do you use a 2x3090 rig for? Bulk, non-time-sensitive inference on down-quantized models?
Yeah. I can run 7B models unquantized, ~13-33B at q8, and ~70B at q4, all at fairly acceptable speeds (>10 tok/s).
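A back-of-the-envelope sketch of why those sizes fit in 2x24 GB. It assumes "unquantized" means 16 bits per weight, treats q8 and q4 as exactly 8 and 4 bits (real quant formats carry some overhead), and counts only the weights, not the KV cache or activations. `weight_gib` and `fits` are hypothetical helper names, not from any library.

```python
# Rough VRAM estimate for model weights only (no KV cache, no activations).
# Assumptions: fp16 = 16 bits/weight, q8 = 8, q4 = 4; real formats add overhead.

TOTAL_VRAM_GIB = 2 * 24  # two RTX 3090s at 24 GB each


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float) -> bool:
    """True if the weights alone fit in the combined VRAM."""
    return weight_gib(params_billion, bits_per_weight) <= TOTAL_VRAM_GIB


if __name__ == "__main__":
    configs = [
        ("7B fp16", 7, 16),
        ("13B q8", 13, 8),
        ("33B q8", 33, 8),
        ("70B q4", 70, 4),
        ("70B fp16", 70, 16),  # included as a case that should not fit
    ]
    for name, params, bits in configs:
        size = weight_gib(params, bits)
        verdict = "fits" if fits(params, bits) else "does not fit"
        print(f"{name}: ~{size:.1f} GiB of weights, {verdict} in {TOTAL_VRAM_GIB} GiB")
```

Whatever is left over after the weights goes to context, so the bigger configurations leave noticeably less room for long prompts.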
whimsicalism | 2 years ago