item 45791775 | samsartor | 3 months ago

Yes. Pretraining and fine-tuning use standard Adam optimizers (usually with weight decay, i.e. AdamW). Reinforcement learning has historically been the odd one out, but these days almost all RL algorithms also use backprop and gradient descent.
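To make the "Adam with weight decay" point concrete, here is a minimal sketch of one AdamW update step (decoupled weight decay, as in Loshchilov & Hutter). The function name, scalar formulation, and hyperparameter defaults are illustrative assumptions, not anything specified in the comment above; in practice frameworks like PyTorch provide this as a built-in optimizer.

```python
import math

def adamw_step(theta, grad, m, v, t,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, wd=1e-2):
    # Update biased first- and second-moment estimates of the gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias-correct the moment estimates (t is the 1-indexed step count).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: applied directly to the parameter,
    # rather than folded into the gradient as in plain Adam + L2.
    theta = theta - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * theta)
    return theta, m, v

# One step on a single scalar parameter, starting from zeroed moments.
theta, m, v = 1.0, 0.0, 0.0
theta, m, v = adamw_step(theta, grad=1.0, m=m, v=v, t=1)
```

The decoupling is the detail that distinguishes AdamW from Adam with L2 regularization: the decay term `wd * theta` bypasses the adaptive `1/sqrt(v_hat)` scaling entirely.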