I don't know that for sure; I'm mostly speculating based on how mixture-of-experts models are outperforming dense decoder-only architectures, which means we're already composing transformers to squeeze the most out of them, and they still seem to fall short. These models have already been trained on incredible amounts of data, yet they still need to be composed into multiple instances and need even better hardware, and they seem to have reached diminishing returns. The question is whether the little that is left to optimize will be enough to make them truly agentic and able to create full apps on their own, or whether they will still require expert supervision for anything useful.
proc0|1 year ago