top | item 40857004


osipov | 1 year ago

What's your basis for claiming that Tinygrad can't compute 2nd-order partial derivatives (i.e. Hessians) needed for L-BFGS? Tinygrad, like PyTorch, uses automatic differentiation, which has no problem supporting nth-order derivatives.
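To illustrate the point that automatic differentiation extends naturally to higher orders, here is a toy sketch (plain-Python forward-mode AD with dual numbers, not tinygrad's or PyTorch's actual reverse-mode machinery): nesting dual numbers yields exact second derivatives.

```python
# Minimal forward-mode automatic differentiation via dual numbers.
# Nesting a Dual inside a Dual gives exact second derivatives,
# illustrating that AD has no intrinsic barrier at first order.

class Dual:
    def __init__(self, val, eps=0.0):
        self.val = val   # primal value
        self.eps = eps   # derivative (tangent) part

    def _lift(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val, self.eps + o.eps)
    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        # Product rule carries through even when val/eps are themselves Duals.
        return Dual(self.val * o.val, self.eps * o.val + self.val * o.eps)
    __rmul__ = __mul__

def second_derivative(f, x):
    # Outer eps tracks d/dx; the nested inner eps tracks d/dx again.
    nested = Dual(Dual(x, 1.0), Dual(1.0, 0.0))
    return f(nested).eps.eps

print(second_derivative(lambda x: x * x * x, 2.0))  # f(x)=x^3, f''(2)=12.0
```

The same nesting trick composes to any order, at the cost of one extra dual level per derivative.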


fjkdlsjflkds | 1 year ago

OP does not (seemingly) claim that tinygrad can't compute Hessians, only that a first-order optimization method was the only thing tried.

Also, as a quasi-Newton method, L-BFGS does not require explicit (pre-)computation of the Hessian: it iteratively builds an implicit estimate of its inverse in an online manner.
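A minimal sketch of that idea, assuming the standard L-BFGS two-loop recursion (generic NumPy code, not any particular library's implementation): the search direction is an approximate inverse-Hessian-vector product built purely from stored step/gradient-difference pairs (s, y), so no Hessian matrix is ever formed.

```python
import numpy as np

def two_loop(grad, s_list, y_list):
    """Approximate H_k^{-1} @ grad from curvature pairs (s_i, y_i).

    s_i = x_{i+1} - x_i, y_i = grad_{i+1} - grad_i. Only dot products
    and vector updates are used -- no explicit (inverse) Hessian.
    """
    q = grad.astype(float).copy()
    stack = []
    # First loop: newest pair to oldest.
    for s, y in reversed(list(zip(s_list, y_list))):
        rho = 1.0 / (y @ s)
        alpha = rho * (s @ q)
        q -= alpha * y
        stack.append((alpha, rho, s, y))
    if s_list:
        # Scale by gamma = (s.y)/(y.y) as the initial H_0 approximation.
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    # Second loop: oldest pair to newest.
    for alpha, rho, s, y in reversed(stack):
        beta = rho * (y @ q)
        q += (alpha - beta) * s
    return q  # approximate inverse-Hessian-vector product

# 1-D quadratic f(x) = 0.5 * a * x^2 with a = 4: one exact curvature
# pair (s, y = a*s) already recovers H^{-1} g = g / a.
d = two_loop(np.array([8.0]), [np.array([1.0])], [np.array([4.0])])
print(d)  # [2.]
```

On a quadratic, each pair satisfies y = H s exactly, which is why a single pair suffices in one dimension; in general L-BFGS keeps only the last m pairs, trading accuracy of the curvature estimate for O(m·n) memory.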

thesz|1 year ago

As someone with a highly unpronounceable nickname said, my only complaint is that only first-order methods were used.

Second order methods are fun, actually. I like them. ;)