
bobbob27 | 7 months ago

The cost of improving training increases exponentially with every milestone; no vendor is even close to recouping those costs now. Not to mention the quality data needed to feed the training.
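To make "exponentially" concrete: under the usual power-law scaling picture (loss falling as a power of compute), each fixed fractional improvement in loss costs a constant *multiple* of compute. A toy sketch with completely made-up numbers for the exponent and constant, just to show the shape of the curve:

    # Toy illustration, not vendor data: if loss follows L = a * C**(-alpha),
    # every fixed fractional loss reduction multiplies the required compute
    # by a constant factor -- exponential cost per "milestone".
    alpha = 0.05   # hypothetical scaling exponent
    a = 10.0       # hypothetical constant

    def compute_for_loss(L):
        return (a / L) ** (1.0 / alpha)

    prev = None
    for k in range(4):
        L = 2.0 * 0.95 ** k          # milestones: loss drops 5% each step
        C = compute_for_loss(L)
        note = "" if prev is None else f"  (~{C / prev:.1f}x previous)"
        print(f"loss {L:.3f} -> compute {C:.3g}{note}")
        prev = C

Each 5% loss milestone here costs roughly 2.8x the compute of the one before it, which is the dynamic being described.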

The R&D is running on the hope that scaling their models up by orders of magnitude (yes, actual orders of magnitude) will eventually hit a miracle that makes their company explode in value and power. They can't explain what that would even look like... but they NEED ever more exorbitant amounts of funding flowing in.

This truly isn't a normal research-to-return ratio.

Luckily, what we already have is kinda useful, and condensing models (distillation, quantization) does show promise. In 5 years I doubt we'll have the post-labor dys/utopia we're being hyped up for, but we may have some truly badass models that can run directly on our phones.
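For anyone wondering what "condensing" looks like in practice, the classic knowledge-distillation recipe trains a small student to match a big teacher's softened outputs. A minimal PyTorch sketch (the temperature, sizes, and names are illustrative, not anything a vendor has published):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, T=2.0):
        # KL divergence between temperature-softened distributions;
        # the T*T factor keeps gradients comparable across temperatures.
        log_student = F.log_softmax(student_logits / T, dim=-1)
        soft_teacher = F.softmax(teacher_logits / T, dim=-1)
        return F.kl_div(log_student, soft_teacher,
                        reduction="batchmean") * (T * T)

    # Random logits standing in for real student/teacher outputs.
    student = torch.randn(4, 32000)   # batch of 4, 32k-token vocab
    teacher = torch.randn(4, 32000)
    print(distillation_loss(student, teacher))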

Like you said, Llama and local inference are cheap. So that's the most logical direction all of this is taking us.
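And local inference really is that accessible already. A minimal sketch using llama-cpp-python (the model filename is a placeholder; any quantized GGUF checkpoint you have on disk will do):

    from llama_cpp import Llama   # pip install llama-cpp-python

    # Hypothetical local file -- substitute whatever GGUF model you have.
    llm = Llama(model_path="./llama-3-8b-instruct.Q4_K_M.gguf",
                n_ctx=2048)       # context window

    out = llm("Summarize why local inference is cheap, in one sentence.",
              max_tokens=64)
    print(out["choices"][0]["text"])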

TeMPOraL | 7 months ago

Nah, the vendors have generally been open about the limits of scaling. The bet isn't that one last order-of-magnitude increase will hit a miracle - the bet is on R&D figuring out a new way to get better model performance before the last one hits diminishing returns. Which, for now, is what's been consistently happening.

There's risk to that assumption, but it's also a reasonable one - let's not forget the whole field is both new and has seen stupid amounts of money being pumped into it over the last few years; this is an inflationary period, with tons of people researching every possible angle, but that research takes time. It's a safe bet that there are still major breakthroughs ahead of us, to be achieved within the next couple of years.

The risky part for the vendors is whether those breakthroughs will happen soon enough for them to capitalize on, keeping their lead (and profits) for another year or so until the next one hits, and so on.