BlackLotus89 | 1 year ago

If you want a GGUF: https://huggingface.co/unsloth/DeepSeek-R1-GGUF

Blog post about the dynamic GGUF quants: https://unsloth.ai/blog/deepseekr1-dynamic
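
If you only want one of the dynamic quants instead of the whole repo, something like this should work with huggingface_hub (untested sketch - the "*UD-IQ1_S*" glob is my guess from the blog post, check the repo's file listing for the exact names):

    # pip install huggingface_hub
    from huggingface_hub import snapshot_download

    # Grab only the 1.58-bit dynamic quant rather than every quant in the repo.
    # The glob below is an assumption; match it to the repo's actual file names.
    snapshot_download(
        repo_id="unsloth/DeepSeek-R1-GGUF",
        local_dir="DeepSeek-R1-GGUF",
        allow_patterns=["*UD-IQ1_S*"],
    )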

The original DeepSeek models can, of course, be found on HF as well: https://huggingface.co/deepseek-ai

Here is an example of how people run DeepSeek on cloud infrastructure that isn't DeepSeek's: https://www.youtube.com/watch?v=bOsvI3HYHgI

genewitch | 1 year ago

We were talking about self-hosting. The deepseek-r1 is 347-713GB depending on quant. No one is running deepseek-r1 "locally, self-hosted".
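
Napkin math on where those numbers come from - size is roughly parameters * bits-per-weight / 8, and r1 is ~671B parameters (the bits-per-weight figures below are nominal effective values, not exact):

    # Rough GGUF size estimate: parameters * bits_per_weight / 8 bytes.
    # Bit-widths are approximate effective values, not exact.
    params = 671e9  # DeepSeek-R1 parameter count
    for name, bits in [("Q8_0", 8.5), ("Q4_K_M", 4.8), ("1.58-bit dynamic", 1.58)]:
        print(f"{name}: ~{params * bits / 8 / 1e9:.0f} GB")
    # prints roughly: Q8_0 ~713 GB, Q4_K_M ~403 GB, 1.58-bit dynamic ~133 GB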

If people want to argue with me, I wish we'd all stick to what we're talking about, instead of saying "but you technically can if you use someone else's hardware" - that's not self-hosted. I self-host a deepseek-r1 distill, locally, on my computer.

It is deepseek, it's just been hand-distilled by someone using a different tool. The deepseek-r1 will get chopped down to 1/8th its size or less, and it won't be called "deepseek-r1" - that's what they call a "foundational model" - and then we'll see the 70B and the 32B and the 14B "deepseek-r1 distills".

Next to no one who messes with this stuff uses foundational or distilled foundational models. Who's still using llama-3.2? Yeah, it's good, it's fine, but there are mixes and MoE and CoT fine-tunes that use llama as the base model, and they're better.

There is no GGUF for running locally, self-hosted, on ordinary hardware. Yes, if you have a datacenter card you can download the weights and run something, but that's different from self-hosting locally with a 30B (for example).

hhh | 1 year ago

I don't really understand what's different between self-hosting using Ollama and self-hosting by running the full weights. I get that Ollama is easier, but you can still self-host the full one?
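
Mechanically it looks the same either way, e.g. with llama-cpp-python (an untested sketch - the path and settings are placeholders, and the full R1 GGUF ships as multiple shards where you point at the first one):

    # pip install llama-cpp-python
    from llama_cpp import Llama

    # Placeholder path: the full R1 GGUF is split into shards; pass the first
    # shard and llama.cpp picks up the rest. A 14B distill would be a single
    # file that fits on one consumer GPU - the API is identical either way.
    llm = Llama(
        model_path="DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf",
        n_gpu_layers=-1,  # offload as many layers as fit on the GPU
        n_ctx=4096,
    )
    out = llm("Explain GGUF quantization in one sentence.", max_tokens=128)
    print(out["choices"][0]["text"])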