memossy | 2 years ago 800M is good for mobile, 8B for graphics cards. Bigger than that is also possible; it's not saturated yet, but it needs more GPUs.
anon373839|2 years ago Do you know how the memory demands compare to LLMs at the same number of parameters? For example, Mistral 7B quantized to 4 bits works very well on an 8GB card, though there isn’t room for long context.
vorticalbox|2 years ago You can also use quantisation, which lowers memory requirements at a small loss of performance.
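To put rough numbers on the quantisation point, here is a back-of-the-envelope sketch of weight memory at different bit widths. It assumes a round 7e9 parameters (an illustrative figure, not an exact count for any model) and counts weights only; the KV cache, activations, and per-block quantisation metadata are not included, so real usage will be higher. The helper name `weight_memory_gib` is made up for this example.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Ignores KV cache, activations, and quantisation scales/zero-points.
    """
    bytes_total = n_params * bits_per_weight / 8  # 8 bits per byte
    return bytes_total / 2**30                    # bytes -> GiB


# Assumed round parameter count for a "7B" model (illustrative only).
N_PARAMS = 7e9

for bits in (16, 8, 4):
    print(f"7B at {bits}-bit: ~{weight_memory_gib(N_PARAMS, bits):.2f} GiB")
```

Under these assumptions the 4-bit weights come to a bit over 3 GiB, which fits with the observation above that a 4-bit 7B model fits on an 8GB card with limited room left for long context.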