alyxya|2 months ago
The hardest part about making a new architecture is that even if it is better than transformers in every way, it's very difficult both to prove a significant improvement at scale and to gain traction. Until Google puts substantial resources into training a scaled-up version of this architecture, I believe there's enough low-hanging fruit in improving existing architectures that it will always take a back seat.
tyre|2 months ago
You don't necessarily have to prove it out on large foundation models first. Can it beat a 32B-parameter model, for example?
swatcoder|2 months ago
While they do have lots of money and many people, they don't have infinite money, and they only have so much hot infrastructure to spread around. You'd expect them to have to gradually build up the case that a large-scale experiment is likely enough to yield a big enough advantage over whatever is already claiming those resources.
p1esk|2 months ago
If Google is not willing to scale it up, then why would anyone else?
nickpsecurity|2 months ago
So, I think they could default to proving it out on small demonstrators.