item 38444747 (no title)
Bayes7 | 2 years ago
Okay, I see that for inference. But for training it shouldn't matter, because I need to hold on to all my activations for my backward pass anyway? But yeah, fair point!