1ba9115454 | 1 year ago
I can't imagine this setup will get more than 1 token per second. I would love to see Deepseek running on premise with a decent TPS.

    thomquaid | 1 year ago
    It says 4.25 TPS in the first para.

        ricardobeat | 1 year ago
        Honest mistake. Some people think HN is just a series of short tweets and haven’t realized they are links yet!

        weatherlight | 1 year ago
        That's still pretty slow, considering there's that "thinking" phase.

    october8140 | 1 year ago
    You can get 1t/s on a raspberry pi. https://youtu.be/o1sN1lB76EA?si=i8ecEBjLdV0zewFQ

        klohto | 1 year ago
        this has nothing to do with the full 671B and the ollama models are distilled qwen2.5