item 44653988 (no title)

sourcecodeplz | 7 months ago
Everyone keeps saying this but it is not really useful. Without a dedicated GPU & VRAM, you are waiting overnight for a response... The MoE models are great but they need dedicated GPU & VRAM to work fast.
jychang | 7 months ago
Well, yeah, you're supposed to put in a GPU. It's a MoE model: the common tensors should be on the GPU, which also does prompt processing. The RAM is for the 400 GB of experts.
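To make the split concrete, here is a rough Python sketch of the placement rule described above: expert tensors go to system RAM, everything else goes to the GPU while it fits. The tensor names, the `_exps` suffix, the 12 GB of shared tensors, and the 24 GB VRAM budget are all made-up assumptions for illustration; only the 400 GB expert total comes from the comment. This is not any particular runtime's loader.

```python
import re

# Hypothetical tensor names and sizes in GB (illustrative, not a real checkpoint).
# The expert tensors sum to 400 GB to match the figure in the thread.
TENSORS = {
    "token_embd": 2.0,
    "blk.0.attn": 4.0,
    "blk.0.ffn_exps": 200.0,
    "blk.1.attn": 4.0,
    "blk.1.ffn_exps": 200.0,
    "output": 2.0,
}

# Assumed naming convention: expert tensors end in "_exps".
EXPERT_PATTERN = re.compile(r"_exps$")


def place_tensors(tensors, vram_gb):
    """Assign each tensor to "gpu" or "cpu" (system RAM).

    Experts always go to RAM; shared tensors go to the GPU until the
    VRAM budget is used up, then spill to RAM.
    """
    placement = {"gpu": [], "cpu": []}
    gpu_used = 0.0
    for name, size_gb in tensors.items():
        if EXPERT_PATTERN.search(name):
            placement["cpu"].append(name)
        elif gpu_used + size_gb <= vram_gb:
            placement["gpu"].append(name)
            gpu_used += size_gb
        else:
            placement["cpu"].append(name)
    return placement


def total_gb(tensors, names):
    """Sum the sizes of the named tensors."""
    return sum(tensors[name] for name in names)


if __name__ == "__main__":
    placement = place_tensors(TENSORS, vram_gb=24.0)
    print("GPU:", total_gb(TENSORS, placement["gpu"]), "GB")
    print("RAM:", total_gb(TENSORS, placement["cpu"]), "GB")
```

With these made-up numbers the shared tensors fit comfortably in VRAM and the experts account for all of the RAM use, which is the point of the reply: the GPU holds the part every token touches, and RAM holds the bulk that is only sparsely read.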