top | item 40488772

justinnk | 1 year ago

(I am one of the authors) Generally speaking, the latter. The purpose of DiscoGrad is just to deliver useful gradients. These provide information about the local behavior of the cost function around the currently evaluated point to an optimizer of your choice, e.g., gradient descent. Interestingly, the smoothing and noise can sometimes prevent getting stuck in undesired (shallow) local minima when using gradient descent.
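To make the point about smoothing concrete, here is a minimal sketch in Python. It is not DiscoGrad's API or its estimator. It is a generic Monte Carlo estimate of the gradient of a Gaussian-smoothed cost, and the names `cost`, `smoothed_gradient`, and `descend`, as well as the values of `sigma`, `lr`, and the sample count, are all assumptions chosen for illustration. The toy cost has a single branch, so its pointwise derivative is zero almost everywhere and plain gradient descent would never move.

```python
import random


def cost(x):
    # Toy program with a branch: the output jumps from 1 to 0 at x = 1.
    return 1.0 if x < 1.0 else 0.0


def pointwise_gradient(f, x, h=1e-6):
    # Central finite difference; zero wherever no branch flips within +/- h.
    return (f(x + h) - f(x - h)) / (2.0 * h)


def smoothed_gradient(f, x, sigma, n_samples, rng):
    # Estimates d/dx E[f(x + sigma * eps)] with eps ~ N(0, 1), using
    # antithetic pairs (+eps, -eps) to reduce the variance of the estimate.
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        total += (f(x + sigma * eps) - f(x - sigma * eps)) * eps
    return total / (2.0 * sigma * n_samples)


def descend(f, x0, sigma=0.5, lr=0.1, steps=200, n_samples=100, seed=0):
    # Plain gradient descent driven by the noisy smoothed gradient.
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        x -= lr * smoothed_gradient(f, x, sigma, n_samples, rng)
    return x


x_final = descend(cost, x0=0.5)
```

Starting at `x0 = 0.5`, `pointwise_gradient(cost, 0.5)` is zero, while the smoothed estimate is negative there, so the descent steps carry `x` across the jump at `x = 1`. A larger `sigma` lets the estimator see discontinuities farther away at the price of a blurrier picture of the cost near the current point.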

pthr | 1 year ago

Thanks for sharing your insight, appreciated! Also your final remark.