top | item 46380237


impulser_ | 2 months ago

They don't need to catch up. They just need to be good enough and fast as fuck. The vast majority of useful LLM tasks have nothing to do with how smart the models are.

GPT-5 models have been the most useless models of any released this year despite being SOTA, and it's because they're slow as fuck.

aschobel|2 months ago

For coding I don’t use any of the previous gen models anymore.

Ideally I would have both fast and SOTA; if I would have to pick one I’d go with SOTA.

There's a report by OpenRouter on what folks tend to pay for; SOTA generally wins in the coding domain. Folks are still paying a premium for those models today.

There's a question of whether there's a bar where coding models are "good enough"; for myself, I always want smarter / SOTA.

wyre|2 months ago

FWIW, coding is one of the largest use cases for LLMs where SOTA quality matters.

I think the bar for when coding models are "good enough" will be a tradeoff between performance and price. I could be using Cerebras Code and saving $50 a month, but Opus 4.5 is fast enough, and I value the peace of mind of knowing its quality is higher than Cerebras' open source models enough to spend the extra money. It might take a while for this gap to close, and what counts as "good enough" will be different for every developer, but the gap certainly can't exist forever.

gejose|2 months ago

> just need to be good enough and fast as fuck

Hard disagree. There are very few scenarios where I'd pick speed (quantity) over intelligence (quality) for anything remotely related to building systems.

ssivark|2 months ago

If you think a human working on something benefits from being "agile" (building fast, shipping quickly, iterating, getting feedback, improving), why should it be any different for AI models?

Implicit in your claim are specific assumptions about how expensive/untenable it is to build systemic guardrails and human feedback loops, and a specific cost/benefit ratio for approximate goal attainment versus perfect goal attainment. Rest assured that there is a whole portfolio of situations where different design points make the most sense.

jameshush|2 months ago

I agree with you for many use cases, but for the use case I'm focused on (Voice AI), speed is absolutely everything. Every millisecond counts for voice, and most voice use cases don't require anything close to "deep thinking." E.g., for inbound customer support, we really just want the voice agent to be fast and follow the SOP.

CuriouslyC|2 months ago

Speed is great for UI iteration or any case where a human must be in the loop.

gessha|2 months ago

As long as the faster tech is reliable and I understand its quirks, I can work with it.

Aurornis|2 months ago

> They don't need to catch up. They just need to be good enough

The current SOTA models are impressive but still far from what I'd consider good enough to not be a constant exercise in frustration. When the SOTA models still have a long way to go, the open-weights models have an even larger gap to close.

nl|2 months ago

GPT-5 Codex is great — the best coding model around, except maybe for Opus.

I'd like more speed, but I'd take more quality over more speed.

echelon|2 months ago

This. You can distill a foundation model into an open-source one. The Chinese labs will be doing this for us for a long time.

We should be glad that the foundation model companies are stuck running on treadmills. Runaway success would be bad for everyone else in the market.

Let them sweat.

AmazingTurtle|2 months ago

I'd prefer a 30-minute response from GPT-5 over a 10-minute response from {Claude/Google} <whatever their SOTA model is> (yes, even Gemini 3).

Reason being: while these models look promising in benchmarks and seem very capable at an affordable price, I *strongly* feel that OpenAI models perform better most of the time. I've had to clean up a Gemini mess or a Claude mess too many times after vibe coding. OpenAI models are just much more reliable with large-scale tasks: organizing, chomping through tasks one by one, etc. That takes time, but the results are 100% worth it.

Demiurge|2 months ago

I get GPT-5.2 responses on Copilot faster than from any other model, almost instantly. Are you sure they're slow as fuck?

dontwannahearit|2 months ago

Confused. Is ‘fuck’ fast or slow? Or both at the same time? Is there a sort of quantum superposition of fuck?

867-5309|2 months ago

well, it's not slow as fuck! it's quick as lightning and speedy as hell