(no title)

personalityson | 5 months ago

Unless each iteration is 90% faster

amelius|5 months ago

This.

In fact, it can be slower, because hardware is probably not optimized for the 1-bit case. There may be a lot of low-hanging fruit for hardware designers, and we may see improvements in the next generation of hardware.

nlitened|5 months ago

Isn't digital (binary) hardware literally optimized for the 1-bit case by definition?
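For anyone wondering what a 1-bit operation looks like in practice, here is a rough sketch of the usual XNOR-plus-popcount trick for a dot product of +1/-1 vectors. This is not taken from the paper; `pack_signs` and `binary_dot` are hypothetical names made up for illustration, and a real kernel would work on machine words rather than Python integers.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int, one bit each (1 means +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element +1/-1 vectors stored as packed bits."""
    mask = (1 << n) - 1
    # XNOR marks positions where the signs agree (product is +1).
    agree = ~(a_bits ^ b_bits) & mask
    matches = bin(agree).count("1")  # popcount
    # Each agreement contributes +1, each disagreement -1.
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_signs(a), pack_signs(b), len(a))
```

The bitwise operations themselves are cheap on any CPU. The open question in this thread is whether the wide matrix units on current accelerators, which are built around float and small-integer formats, can be fed this way efficiently.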

nickpsecurity|5 months ago

FPGAs could be highly competitive for models with unusual but small bit lengths, especially single bits, since their optimizers will handle that easily.

fxtentacle|5 months ago

In this paper, each iteration has to be slower, because they need to calculate both their new method (which may be faster) and the traditional method (because they need a float gradient). Old plus new will always be slower than old alone.