mythz | 11 days ago
I'm old enough to remember when traffic was expensive, so I've no idea how they've managed to offer free hosting for so many models. Hopefully it's backed by a sustainable business model, as the ecosystem would be meaningfully worse without them.
We still need good value hardware to run Kimi/GLM in-house, but at least we've got the weights and distribution sorted.
data-ottawa | 11 days ago
They provide excellent documentation and they’re often very quick to get high quality quants up in major formats. They’re a very trustworthy brand.
If you stream weights in from SSD storage and freely use swap to extend your KV cache, it will be really slow (multiple seconds per token!) but run on basically anything. And that's still really good for stuff that can be computed overnight, perhaps even by batching many requests simultaneously. It gets progressively better as you add more compute, of course.
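The overnight framing roughly checks out; a quick back-of-the-envelope sketch, assuming a hypothetical 3 seconds per token (the comment only says "multiple seconds per token"):

```python
# Back-of-the-envelope: output from an overnight run at SSD-streaming
# speeds. 3 s/token is an assumed figure, not a measurement.
SECONDS_PER_TOKEN = 3.0
OVERNIGHT_HOURS = 8

tokens = OVERNIGHT_HOURS * 3600 / SECONDS_PER_TOKEN
print(f"~{tokens:.0f} tokens generated overnight")  # ~9600 tokens
```

A few thousand tokens is enough for a handful of long answers, which is why this only makes sense for batch-style jobs rather than interactive use.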
Aurornis | 11 days ago
This is fun for proving that it can be done, but that's 100X slower than hosted models and 1000X slower than GPT-Codex-Spark.
That's like going from a real-time conversation to e-mailing someone who only checks their inbox twice a day, if you're lucky.
embedding-shape | 11 days ago
Harder to track downloads, then. Only when clients hit the tracker would it be able to get download stats, and forget about private repositories or the "gated" ones that Meta/Facebook does for their "open" models.
Still, if vanity metrics weren't so important, it'd be a great option. I've even thought of creating my own torrent mirror of HF to provide as a public service, since eventually access to models will be restricted, and it would be nice to be a bit better prepared for that moment.