(no title)
ZaneHam | 2 months ago
Some interesting results:
- 93.8% energy reduction per inference
- 16x memory compression (7B model: 28GB → 1.75GB)
- Zero floating-point multiplication
- Runs on CPUs, no GPU required
- Architectural epistemic uncertainty (it won't hallucinate what it doesn't know)
Repo: https://github.com/Zaneham/Ternary_inference
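For anyone curious how "zero floating-point multiplication" can work: with weights constrained to {-1, 0, +1}, a matrix-vector product reduces to additions and subtractions of activations. A minimal NumPy sketch of the idea (not the repo's actual implementation, just an illustration of the technique):

```python
import numpy as np

def ternary_matvec(W, x):
    """Matrix-vector product where W has entries in {-1, 0, +1}.

    Each output element is a sum of selected activations minus
    another sum -- no multiplications needed at all.
    """
    pos = (W == 1)   # mask of +1 weights per row
    neg = (W == -1)  # mask of -1 weights per row
    return np.array([x[p].sum() - x[n].sum() for p, n in zip(pos, neg)])

# Sanity check against an ordinary matmul
W = np.array([[1, 0, -1],
              [0, 1,  1]])
x = np.array([2.0, 3.0, 4.0])
print(ternary_matvec(W, x))  # matches W @ x
```

The 16x figure follows from the storage side: ~2 bits per ternary weight vs. 32 bits for fp32 (7B × 2 bits ≈ 1.75GB vs. 7B × 4 bytes = 28GB).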
Happy to answer questions :-) Happy holidays and Merry Christmas!