rihegher | 3 years ago
I would be surprised if you can't. The smallest weight file is 14 GB, apparently.

eightysixfour | 3 years ago
https://github.com/facebookresearch/llama/blob/main/FAQ.md#3
Looks like it needs 14 GB for the weights, and it isn't clear what the minimum size for the decoding cache is, but it defaults to settings for 30 GB GPUs.

MacsHeadroom | 3 years ago
In int8, 7B needs only 9 GB of VRAM and 13B needs only 20 GB on a single GPU. https://github.com/oobabooga/text-generation-webui/issues/14...
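The numbers in the thread follow from simple weight-size arithmetic: parameter count times bytes per parameter. A minimal sketch of that estimate (the function name and the overhead note are illustrative, not from the thread; the gap between raw weight size and observed VRAM is the decoding/KV-cache and runtime overhead the commenters mention):

```python
def weight_vram_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Gigabytes needed just to hold the model weights (no cache/overhead)."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9


# fp16 stores 2 bytes per parameter: 7B params -> the 14 GB weight file.
print(weight_vram_gb(7, 2))   # 14.0

# int8 stores 1 byte per parameter: 7B -> 7 GB of weights; the ~9 GB
# observed figure includes decoding cache and runtime overhead.
print(weight_vram_gb(7, 1))   # 7.0

# 13B in int8 -> 13 GB of weights, ~20 GB observed for the same reason.
print(weight_vram_gb(13, 1))  # 13.0
```

This only bounds the weight storage from below; actual single-GPU requirements depend on the decoding cache settings the FAQ leaves configurable.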