While your point about numerical stability is correct in general, there are no numerical stability issues here. I think this misconception, which I've seen in more than one place now, stems from a fundamental misunderstanding of the paper's results. While they _did_ also find faster TPU/GPU-specific algorithms, the primary result is not a fast matmul approximation: it is an exact algorithm composed of stepwise addition/multiplication operations, and hence it is numerically stable and works over any ring (https://ncatlab.org/nlab/show/ring). AlphaTensor itself does not do the matrix multiplication; it was used to perform an (efficiently pruned) tree search over the space of such operations to find an efficient, exact algorithm.
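To make "exact algorithm of stepwise addition/multiplication operations" concrete, here's a sketch of Strassen's classic 7-multiplication algorithm for 2x2 matrices, which is the same flavor of bilinear algorithm AlphaTensor searches for (the specific decompositions in the paper differ; this is just the textbook example). Since it only uses ring operations (+, -, *), it is exact over any ring, illustrated here with plain Python integers:

```python
# Strassen's algorithm for 2x2 matrices: 7 multiplications instead of 8.
# Only ring operations are used, so the result is exact -- no floating-point
# approximation involved. Matrices are ((a11, a12), (a21, a22)) tuples.

def strassen_2x2(A, B):
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # The 7 products (each a single multiplication of linear combinations)
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine into the entries of C = A @ B using only additions
    return ((m1 + m4 - m5 + m7, m3 + m5),
            (m2 + m4,           m1 - m2 + m3 + m6))

A = ((1, 2), (3, 4))
B = ((5, 6), (7, 8))
print(strassen_2x2(A, B))  # ((19, 22), (43, 50)), exactly A @ B
```

Applied recursively to block matrices, this is what gives the sub-cubic asymptotic complexity; AlphaTensor's contribution was finding new decompositions of this kind (e.g. for 4x4 over GF(2)) with fewer multiplications than previously known.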
zekrioca|3 years ago
> One important strength of AlphaTensor is its flexibility to support complex stochastic and non-differentiable rewards (from the tensor rank to practical efficiency on specific hardware), in addition to finding algorithms for custom operations in a wide variety of spaces (such as finite fields). We believe this will spur applications of AlphaTensor towards designing algorithms that optimize metrics that we did not consider here, such as numerical stability or energy usage.
hackpert|3 years ago
(apologies if I misunderstood; I wasn't calling you out specifically, just addressing a general misconception I've noticed in a lot of other discussions so far)