top | item 40512907

fnbr | 1 year ago

The rule of thumb is roughly 44GB: most models are trained in bf16, which takes 16 bits (2 bytes) per parameter, so 22B parameters × 2 bytes ≈ 44GB. You need a bit more for activations, so maybe 50GB?

You need enough RAM and HBM (GPU RAM), so it's a constraint on both.
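The rule of thumb above can be sketched as a quick calculation (a minimal sketch; the function name and the decimal-GB convention are my own choices, and it counts weights only, ignoring activations and KV cache):

```python
def est_weights_gb(params_billion: float, bits_per_param: int = 16) -> float:
    """Estimate memory for model weights alone.

    bits_per_param defaults to 16 (bf16). Ignores activations / KV cache,
    which is why the comment above pads the estimate up to ~50GB.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal GB

# 22B parameters in bf16 -> 44.0 GB, matching the rule of thumb
print(est_weights_gb(22))  # -> 44.0
```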

sharbloop | 1 year ago

Which GPU can I buy to run this model? Can it run on a consumer RTX 3090, or does it need a custom GPU?

Havoc | 1 year ago

A 3090 or 4090 will be able to run quantized 22B models.

Though realistically, for code completion, smaller models will be better due to speed.
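A rough sketch of why a quantized 22B model fits in a 3090/4090 (assuming the cards' 24GB of VRAM; weight-only sizes, ignoring KV cache and activation overhead, so the 8-bit case is tighter than it looks):

```python
def quantized_gb(params_billion: float, bits: int) -> float:
    """Weight-only size in decimal GB at a given quantization bit-width."""
    return params_billion * 1e9 * bits / 8 / 1e9

VRAM_GB = 24  # RTX 3090 / 4090

# Common bit-widths for quantized local inference (e.g. 8-, 5-, 4-bit)
for bits in (16, 8, 5, 4):
    size = quantized_gb(22, bits)
    print(f"{bits:>2}-bit: {size:5.2f} GB  fits in {VRAM_GB} GB: {size < VRAM_GB}")
```

At 4 bits a 22B model's weights take about 11GB, leaving headroom for context; at full bf16 (44GB) it does not fit.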

Novosell | 1 year ago

Most GPUs still use GDDR, I'm pretty sure, not HBM. Do you mean VRAM?