top | item 39830063

(no title)

kxyvr | 1 year ago

This is not entirely true. A typical trick for accomplishing this is to simply use the Hessian approximation `g'(x)*g'(x)` (star denotes adjoint), thus cutting off the second-derivative evaluation of g, and then use a modified CG method like Newton-Krylov for line-search methods or Steihaug-Toint for trust-region methods. It's very robust, matrix-free, and doesn't require an excessive amount of code to implement. Further, the method can be preconditioned with a positive definite preconditioner.

And, to be clear, there's no guarantee that a zero residual be found, only that `g'(x)*g(x)=0`, but that tends to be the nature of nonlinear equations.

discuss

ChrisRackauckas|1 year ago

> It's very robust, matrix-free, and doesn't require an excessive amount of code to implement. Further, the method can be preconditioned with a positive definite preconditioner.

It is not robust, in fact it's well-known as a numerically unstable method. g'(x)*g'(x)` is numerically unstable because it squares the condition number of the matrix. If you have a condition number of 1e-10, which is true for many examples in scientific problems (like the DFN battery model in the paper), the condition number of J'J is 1e-20, which effectively means a linear solve is unable to retrieve any digits of accuracy with 64-bit floating point numbers. So while I agree that you can use symmetric linear solvers as a small performance improvement in some cases, you have to be very careful when you do this as it is a very numerically unstable method. For some details, see https://math.stackexchange.com/a/2874902

Also, if you look at Figure 5 in the paper, you will see that there is a way to choose operator conditioning https://docs.sciml.ai/LinearSolve/stable/basics/OperatorAssu..., and if you tell it to assume the operator is well-conditioned it will actually use this trick. But we do not default to that because of the ill-conditioning.

kxyvr|1 year ago

Yes, care must be taken to stabilize the algorithm. However, I do not believe it to be true that no digits of accuracy can be obtained. The algorithm falls back to the Cauchy (steepest-descent) point on the first iteration and this will give reduction. That said, I'm willing to solve this particular problem and see. Where's your code for the DFN battery model? The page here leads to a dead github link:

https://help.juliahub.com/batteries/stable/api/#JuliaSimBatt...

avikpal1410|1 year ago

That is precisely how trust region methods work for nonlinear least squares problems and nonlinear systems. MINPACK, which both SciPy and Matlab use, defaults to this kind of TR scheme (without the matrix-free part). We do have TR comparisons in the paper.

Note that we don't however solve the normal form J'J system as that is generally bad (unless user opts in to doing so) and instead use least squares formulation which is more numeraically stable