(I am one of the authors) Generally speaking, the latter. The purpose of DiscoGrad is just to deliver useful gradients. These provide information about the local behavior of the cost function around the currently evaluated point to an optimizer of your choice, e.g., gradient descent. Interestingly, the smoothing and noise can sometimes prevent getting stuck in undesired (shallow) local minima when using gradient descent.
pthr|1 year ago