(no title)
rjknight | 10 months ago
LLMs are still somewhat experimental: much of the stack is new-ish, and therefore relatively unoptimised compared to where it could be. Say we took 10% of the training compute budget and spent it on an army of AI coders whose job is to make the training process 12% more efficient. Could they do it? Given the immature state of the stack it sounds plausible to me, but it would depend a lot on having the right infrastructure and practices in place, and those are also immature.
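As a quick sanity check on those numbers (a back-of-the-envelope sketch; the 10% and 12% figures are from the comment above, everything else is illustrative):

    # Divert a fraction f of the training budget to AI coders;
    # suppose they make the remaining training (1 + g)x more efficient.
    f = 0.10  # share of compute diverted to optimisation work
    g = 0.12  # efficiency gain delivered

    # Effective compute relative to spending the whole budget on training:
    print((1 - f) * (1 + g))  # ~1.008, a slim net win

    # Break-even: (1 - f) * (1 + g) >= 1  =>  g >= f / (1 - f)
    print(f / (1 - f))        # ~0.111, i.e. ~11.1% gain needed to break even

So 12% is only just past break-even for a 10% diversion; the bet pays off meaningfully only if the gains compound or carry over to future runs.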
The bull case is that there's some order-of-magnitude speedup available, possibly several, but that finding it requires a lot of experimentation of the kind tireless AI engineers might excel at. The bear case is that efficiency gains will be small, hard-earned, or specific to an architecture that's rapidly becoming obsolete; or that gains will look good until the low-hanging fruit is exhausted, and then dry up.
quonn | 10 months ago
fc417fc802 | 10 months ago
Simple and dense, sure. Highly optimized in a low-level math and hardware sense, but not in a higher-level, information-theoretic sense when considering the model as a whole.
Consider that quantization and compression techniques can achieve on the order of a 50% size reduction. That strongly suggests to me that current models aren't structured very efficiently.
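For intuition on where a ~50% figure comes from, here's a minimal sketch comparing raw parameter storage at fp16 vs int8 (the matrix size is illustrative; a real quantization scheme also stores scales/zero-points on top of this):

    import torch

    # A toy weight matrix standing in for one layer's parameters.
    w = torch.randn(4096, 4096)

    fp16 = w.to(torch.float16)
    int8 = torch.zeros_like(w, dtype=torch.int8)  # placeholder for quantized weights

    print(fp16.element_size() * fp16.numel() / 2**20)  # 32.0 MiB
    print(int8.element_size() * int8.numel() / 2**20)  # 16.0 MiB, ~50% smaller

    # Note: this shows only the storage arithmetic; actual int8 quantization
    # maps each value through a per-tensor or per-channel scale,
    # q = round(w / scale), rather than a raw dtype cast.

Which is the parent's point: if half the bits can be discarded without breaking the model, the higher-precision representation was carrying a lot of redundancy.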