item 44534330 (no title)

martin_ | 7 months ago
How do you run a 1T-param model at low cost?

  maven29 | 7 months ago
  32B active parameters with a single shared expert.

    JustFinishedBSG | 7 months ago
    This doesn’t change the VRAM usage, only the compute requirements.
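
A rough back-of-the-envelope sketch of the point made in the last reply: in a mixture-of-experts model all expert weights must stay resident in memory, while per-token compute scales with the active parameters only. The 1T total and 32B active figures come from the thread. The bytes-per-parameter values and the "about 2 FLOPs per active parameter per token" rule of thumb are assumptions for illustration, and KV cache and activation memory are ignored.

```python
# Rough estimate only: weights-only memory, forward-pass FLOPs per token.
TOTAL_PARAMS = 1e12    # 1T total parameters (from the thread)
ACTIVE_PARAMS = 32e9   # 32B active parameters per token (from the thread)


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold all weights, in GB (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token (assumed ~2 per active param)."""
    return 2.0 * active_params


if __name__ == "__main__":
    # Assumed storage precisions; the thread does not say which one is used.
    for name, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
        gb = weight_memory_gb(TOTAL_PARAMS, bytes_per_param)
        print(f"{name}: ~{gb:.0f} GB of weights")

    moe = flops_per_token(ACTIVE_PARAMS)
    dense = flops_per_token(TOTAL_PARAMS)
    print(f"per-token compute: ~{moe:.2e} FLOPs (vs ~{dense:.2e} if all params were active)")
```

Under these assumptions the memory footprint depends only on the total parameter count and the storage precision, so sparsity lowers the compute per token but not the amount of memory the weights occupy.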