Four bits per parameter. (A parameter is what you call an integer here.)
I was skeptical of it for some time, but it seems to work because individual parameters don’t encode much information. The knowledge is embedded thanks to having a massive number of low bit parameters.
Kranar|2 years ago
If it means that weights for an LLM can be 4 bits well that's just mind boggling.
sillysaurusx|2 years ago
I was skeptical of it for some time, but it seems to work because individual parameters don’t encode much information. The knowledge is embedded thanks to having a massive number of low bit parameters.
justahuman74|2 years ago