top | item 38444747

(no title)

Bayes7 | 2 years ago

Okay, I see that for inference. But for training it shouldn't matter because I need to hold on to all my activations for my backwards pass anyways? But yeah, fair point!

discuss

order

No comments yet.