top | item 47088506

mythz | 11 days ago

I consider HuggingFace more "Open AI" than OpenAI - one of the few quiet heroes (along with Chinese OSS) helping bring on-premise AI to the masses.

I'm old enough to remember when traffic was expensive, so I've no idea how they've managed to offer free hosting for so many models. Hopefully it's backed by a sustainable business model, as the ecosystem would be meaningfully worse without them.

We still need good value hardware to run Kimi/GLM in-house, but at least we've got the weights and distribution sorted.

data-ottawa|11 days ago

Can we toss in the work unsloth does too as an unsung hero?

They provide excellent documentation and they’re often very quick to get high quality quants up in major formats. They’re a very trustworthy brand.

disiplus|11 days ago

Yeah, they're the good guys. I suspect the open source work is mostly advertising for the consulting and services they sell to enterprises. Otherwise, the work they do wouldn't make sense to offer for free.

cubie|11 days ago

I'm a big fan of their work as well, good shout.

Tepix|11 days ago

It's insane how much traffic HF must be pushing out the door. I routinely download models that are hundreds of gigabytes in size from them. A fantastic service to the sovereign AI community.

razster|11 days ago

My fear is that these large "AI" companies will lobby to have these open source options removed or banned, which is a growing concern. I can't overstate how much I enjoy using what HF provides; I religiously browse their site for new and exciting models to try.

Onavo|10 days ago

Bandwidth is not that expensive. The Big 3 clouds just want to milk customers via egress fees. Look at Hetzner or Cloudflare R2 if you want to get an idea of commodity bandwidth costs.
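A back-of-envelope comparison makes the point. The per-GB rates below are rough illustrative figures, not quoted prices:

```python
# Back-of-envelope egress cost comparison (illustrative prices, not quotes).
PRICE_PER_GB = {
    "big-cloud egress": 0.09,   # typical hyperscaler internet egress tier
    "hetzner overage": 0.001,   # roughly 1 EUR/TB beyond included traffic
    "cloudflare r2": 0.0,       # R2 bills storage/operations, not egress
}

def egress_cost(gb: float, provider: str) -> float:
    """Cost in dollars to serve `gb` gigabytes from `provider`."""
    return gb * PRICE_PER_GB[provider]

# Serving one 500 GB model to 10,000 downloaders:
gb_served = 500 * 10_000
for name in PRICE_PER_GB:
    print(f"{name}: ${egress_cost(gb_served, name):,.0f}")
```

Even at these rough numbers, the same traffic differs by several orders of magnitude in cost depending on where it's served from.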

vardalab|11 days ago

Yup, I have downloaded probably a terabyte in the last week, especially with the Step 3.5 model being released and Minimax quants. I wonder what my ISP thinks. I hope they don't cut me off. They gave me a fast lane, they better let me use it, lol

zozbot234|11 days ago

> We still need good value hardware to run Kimi/GLM in-house

If you stream weights in from SSD storage and freely use swap to extend your KV cache it will be really slow (multiple seconds per token!) but run on basically anything. And that's still really good for stuff that can be computed overnight, perhaps even by batching many requests simultaneously. It gets progressively better as you add more compute, of course.
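The seconds-per-token figure falls out of simple arithmetic: if the active weights don't fit in RAM, each token costs roughly one read of the active parameter set from disk. A rough sketch, where the active parameter count, quantization width, and SSD bandwidth are all assumed illustrative numbers:

```python
# Rough seconds-per-token estimate when streaming MoE weights from SSD.
# Assumption: each token touches roughly the active parameter set, which
# must be re-read from disk if it doesn't fit in RAM. Numbers illustrative.

def seconds_per_token(active_params_b: float, bits_per_weight: float,
                      ssd_gbps: float) -> float:
    """Disk-bound latency: bytes read per token divided by read bandwidth."""
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bytes_per_token / (ssd_gbps * 1e9)

# e.g. ~32B active params at 4-bit over a 3 GB/s NVMe read path:
print(f"{seconds_per_token(32, 4, 3.0):.1f} s/token")
```

With those assumed numbers you land around 5 seconds per token, which matches the "multiple seconds per token" regime described above.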

Aurornis|11 days ago

> it will be really slow (multiple seconds per token!)

This is fun for proving that it can be done, but that's 100X slower than hosted models and 1000X slower than GPT-Codex-Spark.

That's like going from real time conversation to e-mailing someone who only checks their inbox twice a day if you're lucky.

HPsquared|11 days ago

At a certain point the energy starts to cost more than renting some GPUs.
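That crossover is easy to estimate. A sketch with assumed wattage, electricity price, and throughput (none of these are measured figures):

```python
# Sketch: electricity cost per million tokens generated locally, under
# assumed rates. All numbers are illustrative, not measured.

def local_cost_per_mtok(watts: float, price_per_kwh: float,
                        tokens_per_sec: float) -> float:
    """Electricity cost (dollars) per million tokens generated locally."""
    kwh_per_token = watts / 1000 / 3600 / tokens_per_sec
    return kwh_per_token * price_per_kwh * 1e6

# 400 W box at $0.30/kWh producing 0.5 tok/s (SSD-streaming regime):
slow = local_cost_per_mtok(400, 0.30, 0.5)
# Same box at 30 tok/s (weights resident in fast memory):
fast = local_cost_per_mtok(400, 0.30, 30)
print(f"slow: ${slow:.2f}/Mtok, fast: ${fast:.2f}/Mtok")
```

At sub-token-per-second speeds the electricity alone runs to tens of dollars per million tokens under these assumptions, which is where rented GPU time can start to look cheaper.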

sowbug|11 days ago

Why doesn't HF support BitTorrent? I know about hf-torrent and hf_transfer, but those aren't nearly as accessible as a link in the web UI.

embedding-shape|11 days ago

> Why doesn't HF support BitTorrent?

Harder to track downloads then. Only when clients hit the tracker would they be able to collect download stats, and forget about private repositories or the "gated" ones that Meta/Facebook uses for their "open" models.

Still, if vanity metrics weren't so important, it'd be a great option. I've even thought of creating my own torrent mirror of HF to provide as a public service, as access to models will eventually be restricted, and it would be nice to be a bit better prepared for that moment.
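For what it's worth, such a mirror would be mechanically simple: a BitTorrent v1 infohash is a deterministic function of the file contents, so anyone hashing the same model shard publishes the same magnet link. A minimal sketch (the filename is hypothetical; bencoding and piece hashing follow the standard v1 rules):

```python
# Minimal sketch: build a single-file BitTorrent v1 "info" dict for a
# model shard and derive its infohash. Filename below is hypothetical.
import hashlib

def bencode(obj) -> bytes:
    """Encode ints, bytes, strings, lists, and dicts per BitTorrent v1."""
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, str):
        return bencode(obj.encode())
    if isinstance(obj, list):
        return b"l" + b"".join(bencode(x) for x in obj) + b"e"
    if isinstance(obj, dict):
        # v1 requires keys sorted as raw byte strings
        items = sorted((k.encode() if isinstance(k, str) else k, v)
                       for k, v in obj.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(type(obj))

def infohash(name: str, data: bytes, piece_len: int = 262144) -> str:
    """SHA-1 of the bencoded info dict: the identifier a magnet link carries."""
    pieces = b"".join(hashlib.sha1(data[i:i + piece_len]).digest()
                      for i in range(0, len(data), piece_len))
    info = {"name": name, "length": len(data),
            "piece length": piece_len, "pieces": pieces}
    return hashlib.sha1(bencode(info)).hexdigest()

print(infohash("model-00001-of-00042.safetensors", b"\x00" * 500_000))
```

Two independent mirrors hashing the same file would emit the same infohash, so swarms for popular models would merge naturally.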

Fin_Code|11 days ago

I still don't know why they are not running on torrents. It's the perfect use case.

heliumtera|11 days ago

How can you be the man in the middle in a truly P2P environment?

freedomben|11 days ago

That would shut out most people working for big corp, which is probably a huge percentage of the user base. It's dumb, but that's just the way corp IT is (no torrenting allowed).