bick_nyers | 1 year ago
Inference will be slower on a dense 70B model, since DeepSeek is an MoE that only activates about 37B parameters per token. That's what makes CPU inference remotely feasible here.
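A rough back-of-the-envelope sketch of that comparison, assuming token generation is limited by memory bandwidth and that every active weight is read once per token. The bandwidth and bytes-per-parameter values are made-up placeholders, not measurements, and the function name is hypothetical:

```python
def tokens_per_second(active_params_billion, bytes_per_param, mem_bandwidth_gb_s):
    """Upper bound on decode speed if each active weight is read once per token."""
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return mem_bandwidth_gb_s * 1e9 / bytes_per_token


# Assumed values for illustration only.
BANDWIDTH_GB_S = 200.0   # hypothetical CPU memory bandwidth
BYTES_PER_PARAM = 1.0    # hypothetical 8-bit quantized weights

# A dense model touches all 70B parameters for every token.
dense_70b = tokens_per_second(70, BYTES_PER_PARAM, BANDWIDTH_GB_S)

# The MoE touches only the ~37B active parameters for every token.
moe_37b_active = tokens_per_second(37, BYTES_PER_PARAM, BANDWIDTH_GB_S)

print(f"dense 70B:      {dense_70b:.2f} tok/s (upper bound)")
print(f"MoE 37B active: {moe_37b_active:.2f} tok/s (upper bound)")
print(f"speedup:        {moe_37b_active / dense_70b:.2f}x")
```

Under this model the ratio is just 70/37 regardless of the bandwidth you plug in. It ignores compute, cache effects, and expert routing overhead, and the full MoE weights still have to fit in RAM even though only a fraction is read per token.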