rihegher | 3 years ago
I would be surprised if you can't. The smallest weight file is 14 GB, apparently.

eightysixfour | 3 years ago
https://github.com/facebookresearch/llama/blob/main/FAQ.md#3
Looks like it needs 14 GB for the weights, and it isn't clear what the minimum size for the decoding cache is, but it defaults to settings for 30 GB GPUs.

MacsHeadroom | 3 years ago
In int8, 7B needs only 9 GB of VRAM and 13B needs only 20 GB on a single GPU. https://github.com/oobabooga/text-generation-webui/issues/14...
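The numbers in the thread follow from simple weight-size arithmetic: parameter count times bytes per parameter. A minimal sketch of that estimate (the function name and the overhead note are illustrative, not from the thread; the gap between raw weight size and observed VRAM is the decoding/KV-cache and runtime overhead the commenters mention):

```python
def weight_vram_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Gigabytes needed just to hold the model weights (no cache/overhead)."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9


# fp16 stores 2 bytes per parameter: 7B params -> the 14 GB weight file.
print(weight_vram_gb(7, 2))   # 14.0

# int8 stores 1 byte per parameter: 7B -> 7 GB of weights; the ~9 GB
# observed figure includes decoding cache and runtime overhead.
print(weight_vram_gb(7, 1))   # 7.0

# 13B in int8 -> 13 GB of weights, ~20 GB observed for the same reason.
print(weight_vram_gb(13, 1))  # 13.0
```

This only bounds the weight storage from below; actual single-GPU requirements depend on the decoding cache settings the FAQ leaves configurable.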