areddyyt | 1 year ago
To be more explicit, the weight matrix's values belong to the set {-1, 0, 1}. When we use two bits to encode each weight, one of the four possible states goes unused:
10 => 1, 01 => 0, 00 => -1, 11 => ?
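A minimal Python sketch of that two-bit packing, using the mapping above. The names `pack_2bit` and `unpack_2bit` are made up for illustration, and putting the first weight in the low bits of each byte is an arbitrary choice on my part:

```python
# Two-bit codes from the mapping above; 0b11 is the unused state.
ENCODE = {1: 0b10, 0: 0b01, -1: 0b00}
DECODE = {code: weight for weight, code in ENCODE.items()}

def pack_2bit(weights):
    """Pack ternary weights four per byte, first weight in the low bits."""
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= ENCODE[w] << (2 * j)
        out.append(byte)
    return bytes(out)

def unpack_2bit(data, n):
    """Recover the first n weights; padding in the last byte is dropped."""
    weights = []
    for byte in data:
        for j in range(4):
            weights.append(DECODE[(byte >> (2 * j)) & 0b11])
    return weights[:n]
```

That is four weights per byte, so 2 bits per weight, against a lower bound of log2(3) ≈ 1.585 bits.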
I think choosing the encoding with the best radix economy will matter more on custom silicon, where we can implement circuits and instructions that rapidly decompress the weights or operate on the compressed weights directly.
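For a sense of what a denser encoding could look like: since 3^5 = 243 fits in a byte, five ternary weights can share 8 bits, which is 1.6 bits per weight. A rough Python sketch of one such scheme follows. The function names are hypothetical and this is not taken from any particular library or chip:

```python
def pack_base3(weights):
    """Pack ternary weights (-1, 0, 1) five per byte as base-3 digits."""
    out = bytearray()
    for i in range(0, len(weights), 5):
        value = 0
        # Walk the chunk backwards so the first weight ends up
        # as the least significant base-3 digit.
        for w in reversed(weights[i:i + 5]):
            value = value * 3 + (w + 1)  # map -1, 0, 1 to digits 0, 1, 2
        out.append(value)  # at most 242, so it always fits in a byte
    return bytes(out)

def unpack_base3(data, n):
    """Recover the first n weights; padding in the last byte is dropped."""
    weights = []
    for value in data:
        for _ in range(5):
            value, digit = divmod(value, 3)
            weights.append(digit - 1)
    return weights[:n]
```

The catch is that unpacking needs repeated division by 3 instead of shifts and masks, which is awkward on general-purpose hardware and is the kind of step a dedicated instruction could make cheap.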