(no title)
rjknight | 10 months ago
LLMs are still somewhat experimental: much of the stack is new-ish, and therefore relatively unoptimised compared to where it could be. Say we took 10% of the training compute budget and spent it on an army of AI coders whose job is to make the training process 12% more efficient. Could they do it? Given the immature state of the stack it sounds plausible to me, but it would depend a lot on having the right infrastructure and practices in place, and those are also immature.
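As a quick sanity check on those numbers (a back-of-the-envelope sketch; the 10% and 12% figures are from the comment above, everything else is illustrative):

    # Divert a fraction f of the training budget to AI coders;
    # suppose they make the remaining training (1 + g)x more efficient.
    f = 0.10  # share of compute diverted to optimisation work
    g = 0.12  # efficiency gain delivered

    # Effective compute relative to spending the whole budget on training:
    print((1 - f) * (1 + g))  # ~1.008, a slim net win

    # Break-even: (1 - f) * (1 + g) >= 1  =>  g >= f / (1 - f)
    print(f / (1 - f))        # ~0.111, i.e. ~11.1% gain needed to break even

So 12% is only just past break-even for a 10% diversion; the bet pays off meaningfully only if the gains compound or carry over to future runs.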
The bull case is that there's some order-of-magnitude speedup available, possibly several, but that finding it requires a lot of experimentation of the kind tireless AI engineers might excel at. The bear case is that efficiency gains will be small, hard-earned, or specific to an architecture that's rapidly becoming obsolete; or that gains will look good until the low-hanging fruit is exhausted, and then dry up.
quonn | 10 months ago
fc417fc802 | 10 months ago
Simple and dense, sure. Highly optimized in a low-level math and hardware sense, but not in a higher-level, information-theoretic sense when considering the model as a whole.
Consider that quantization and compression techniques can achieve on the order of a 50% size reduction. That strongly suggests to me that current models aren't structured very efficiently.
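For intuition on where a ~50% figure comes from, here's a minimal sketch comparing raw parameter storage at fp16 vs int8 (the matrix size is illustrative; a real quantization scheme also stores scales/zero-points on top of this):

    import torch

    # A toy weight matrix standing in for one layer's parameters.
    w = torch.randn(4096, 4096)

    fp16 = w.to(torch.float16)
    int8 = torch.zeros_like(w, dtype=torch.int8)  # placeholder for quantized weights

    print(fp16.element_size() * fp16.numel() / 2**20)  # 32.0 MiB
    print(int8.element_size() * int8.numel() / 2**20)  # 16.0 MiB, ~50% smaller

    # Note: this shows only the storage arithmetic; actual int8 quantization
    # maps each value through a per-tensor or per-channel scale,
    # q = round(w / scale), rather than a raw dtype cast.

Which is the parent's point: if half the bits can be discarded without breaking the model, the higher-precision representation was carrying a lot of redundancy.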