top | item 42898307

mayukhdeb | 1 year ago

In this paper, we don't zero out the weights. We remove them.

vlovich123|1 year ago

Thanks for the correction! Can it be retrofitted into existing models through distillation, or do you have to train the model from scratch?