MuffinFlavored | 25 days ago
Thank you. Could you give a tl;dr on "the full model needs ____ this much VRAM and if you do _____ the most common quantization method it will run in ____ this much VRAM" rough estimate please?
omneity | 25 days ago
It’s a trivial calculation to make (+/- 10%).

Number of params == “variables” in memory.

VRAM footprint ~= number of params * size of a param.

A 4B model at 8 bits (1 byte per param) takes about 4GB of VRAM, give or take: the same number as the param count in billions. At 4 bits it's ~2GB, and so on. Kimi is about 512GB at 4 bits.
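If it helps, here is the same rule of thumb as a rough Python sketch. The function name `estimate_vram_gb` is made up for illustration, GB is taken as 1e9 bytes, and the result covers weights only (no KV cache, activations, or runtime overhead), so treat it as a ballpark rather than a guarantee.

```python
def estimate_vram_gb(num_params: float, bits_per_param: float) -> float:
    """Rough weights-only footprint in GB (assumes 1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8  # 8 bits -> 1 byte, 4 bits -> 0.5 byte
    return num_params * bytes_per_param / 1e9


# A 4B-parameter model at a few common precisions.
for bits in (16, 8, 4):
    print(f"4B params at {bits} bits: ~{estimate_vram_gb(4e9, bits):.0f} GB")
```

Run backwards, the same formula says a 512GB footprint at 4 bits corresponds to roughly 1e12 parameters.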