Absolutely plausible (BFGS is awesome), but it's situation-dependent (no free lunch, and all that). In the context of training neural networks, things get even more complicated once you account for the implicit regularisation introduced by the optimizer. It's often worthwhile to try an SGD-type optimizer, BFGS, and a Newton variant to see which works best for a particular problem.
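As a rough sketch of what "try all three" might look like on a small deterministic problem (using SciPy's built-in Rosenbrock helpers as a stand-in objective, and plain fixed-step gradient descent as a crude proxy for an SGD-type method; the step size and iteration count are arbitrary choices, not recommendations):

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

x0 = np.array([-1.2, 1.0])  # classic hard starting point for Rosenbrock

# 1) Fixed-step gradient descent (SGD-type proxy; lr chosen ad hoc)
x = x0.copy()
for _ in range(20000):
    x -= 1e-3 * rosen_der(x)

# 2) Quasi-Newton: BFGS builds a Hessian approximation from gradients
res_bfgs = minimize(rosen, x0, jac=rosen_der, method="BFGS")

# 3) Newton variant: Newton-CG uses the exact Hessian
res_newton = minimize(rosen, x0, jac=rosen_der, hess=rosen_hess,
                      method="Newton-CG")

print("GD:       ", x, rosen(x))
print("BFGS:     ", res_bfgs.x, res_bfgs.fun, "iters:", res_bfgs.nit)
print("Newton-CG:", res_newton.x, res_newton.fun, "iters:", res_newton.nit)
```

On a problem like this the second-order methods typically reach the minimum at (1, 1) in far fewer iterations than gradient descent, which is the kind of comparison you'd want to run before committing to one optimizer family.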
hodgehog11 | 1 month ago