I'm not an AI person, in any technical sense. The savings being claimed, and I assume verified, are on ARM and x86 chips. The piece doesn't mention swapping multiplies for adds, and a 1-bit LLM is, well, a 1-bit LLM.
Not every instruction on a CPU or GPU uses the same amount of power. So if you can rewrite your algorithm to use more power-efficient instructions (even if you technically execute more of them), you can reduce overall power draw.
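To make the arithmetic behind that concrete, here is a minimal sketch. The per-operation energy costs and the instruction counts are hypothetical placeholders chosen for illustration, not measurements of any real chip.

```python
# Hypothetical per-operation energy costs in picojoules.
# These are illustrative placeholders, NOT measured values for any real CPU/GPU.
ENERGY_PJ = {"fp_mul": 4.0, "int_add": 0.1}


def total_energy_pj(op_counts):
    """Total energy for an instruction mix: sum of count * per-op cost."""
    return sum(ENERGY_PJ[op] * count for op, count in op_counts.items())


def percent_saved(before, after):
    """Relative saving of `after` versus `before`, in percent."""
    return 100.0 * (before - after) / before


# Assumed workloads: the rewritten version executes 3x as many instructions,
# but each one is (by assumption) far cheaper.
baseline = {"fp_mul": 1_000_000}
rewritten = {"int_add": 3_000_000}

e_before = total_energy_pj(baseline)
e_after = total_energy_pj(rewritten)
saved = percent_saved(e_before, e_after)

print(f"before: {e_before:.0f} pJ, after: {e_after:.0f} pJ, saved: {saved:.1f}%")
```

With these made-up numbers the rewritten version runs three times as many instructions and still uses less total energy, which is the whole point: instruction count and energy are not the same thing.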
Yeah, I get what you're saying, but both are challenging the current MatMul methods. The L-Mul paper claims "a power savings of 95%" and that is the thread topic. BitNet proves that at least 70% is possible by getting rid of MatMul.
kayo_20211030|1 year ago
Also,
> Additionally, it reduces energy consumption by 55.4% to 70.0%
With humility, I don't know what that means. It seems like some dubious math with percentages.
sroussey|1 year ago
That said, time to market has been more important than any concern for efficiency for some time. Now and in the future, there is more of a focus on it, as the costs of equipment and power have really grown.
Randor|1 year ago
I would start by downloading a 1.58 model such as: https://huggingface.co/HF1BitLLM/Llama3-8B-1.58-100B-tokens
Run the non-quantized version of the model on your 3090/4090 GPU and observe the power draw. Then load the 1.58 model and observe the power usage. Sure, the numbers have a wide range, because there are many GPUs/NPUs to compare across.
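A rough sketch of how that comparison could be scripted, assuming an NVIDIA card with `nvidia-smi` on the PATH. The function names, sampling interval, and the idea of normalizing by tokens generated are my own assumptions, not part of any benchmark from the linked model or paper. Since energy is power multiplied by time, comparing joules per token is fairer than comparing watts alone when the two models run at different speeds.

```python
import subprocess
import time


def read_power_watts(gpu_index=0):
    """Read instantaneous board power (watts) from nvidia-smi."""
    out = subprocess.run(
        [
            "nvidia-smi",
            f"--id={gpu_index}",
            "--query-gpu=power.draw",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return float(out.strip().splitlines()[0])


def sample_power(duration_s=30.0, interval_s=1.0, reader=read_power_watts):
    """Poll `reader` until `duration_s` has elapsed; return the samples."""
    samples = []
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        samples.append(reader())
        time.sleep(interval_s)
    return samples


def mean(values):
    return sum(values) / len(values)


def joules_per_token(mean_watts, seconds, tokens):
    """Energy per generated token: watts * seconds / tokens."""
    return mean_watts * seconds / tokens


def percent_reduction(baseline, candidate):
    """How much smaller `candidate` is than `baseline`, in percent."""
    return 100.0 * (baseline - candidate) / baseline
```

To use it, run `sample_power()` in one process while each model generates a fixed number of tokens in another, then feed the two `joules_per_token` results into `percent_reduction`. A figure like "70%" then simply means the 1.58 model used 30% of the baseline's energy for the same output.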
littlestymaar|1 year ago
Randor|1 year ago