top | item 44763219

sophia01 | 7 months ago

My understanding is that the coding agents people use can be modified to plug into any LLM provider's API?

The difference here seems to be that Cerebras does not appear to have Qwen3-Coder through their API! So now there is a crazy fast (and apparently good too?) model that they only provide if you pay the crazy monthly sub?

social_quotient|7 months ago

Exactly! You can use tools like https://github.com/musistudio/claude-code-router which let you use other LLMs.

The way I would use this $50 Cerebras offering is as a delegate for high-token-count work like documentation, lint fixing, and similar operations, not only to speed up the workflow but also to relieve some back pressure on Anthropic/Claude so you don’t hit your limits as quickly… especially with the new weekly throttle coming. The $50 jump seems very reasonable. As for the 1k completions a day, I’d really want to see and get a feel for how chatty it is.

I suppose that’s how it starts, but if the model is competent and fast, the speed alone might push you to delegate more to it (maybe sub-agent tasks).
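
To make the delegation idea concrete, here is a minimal sketch of the routing decision, not how claude-code-router or any particular agent actually does it. The backend labels, the task kinds, and the 20,000-token threshold are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical backend labels; these are not real model IDs or endpoints.
FAST_BACKEND = "fast-delegate"
PRIMARY_BACKEND = "primary"

# Task kinds mentioned above as good candidates for delegation (names are made up).
DELEGATED_KINDS = {"documentation", "lint_fix"}


@dataclass
class Task:
    kind: str              # e.g. "documentation", "lint_fix", "refactor"
    estimated_tokens: int  # rough guess at how many tokens the task will consume


def pick_backend(task: Task, token_threshold: int = 20_000) -> str:
    """Send bulk, high-token work to the fast backend; keep the rest on the primary."""
    if task.kind in DELEGATED_KINDS or task.estimated_tokens >= token_threshold:
        return FAST_BACKEND
    return PRIMARY_BACKEND


if __name__ == "__main__":
    for task in (Task("documentation", 500), Task("refactor", 1_000), Task("refactor", 50_000)):
        print(task.kind, task.estimated_tokens, "->", pick_backend(task))
```

In practice you would probably also want to count completions against the daily cap before delegating, but that depends on how chatty the agent turns out to be.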

pxc|7 months ago

You can still get it pay-as-you-go on OpenRouter, afaict, and the billing section of the Cerebras Cloud account I just created has a section for Qwen3-Coder-480B as well.

sophia01|7 months ago

Yeah, just checked: apparently it is available as a preview (not on the main models/pricing page).

baq|7 months ago

define 'crazy'.

it's two kilotokens per second. that's fast.

bangaladore|7 months ago

It's more than 10x faster than the fastest alternative. And roughly 50x the average alternative.

Certainly, somewhere between fast and crazy.
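
For a rough sense of what those ratios mean in wall-clock time, here is a back-of-the-envelope sketch. The 2,000 tokens/s figure and the 10x and 50x ratios come from the comments above; the 10,000-token response size is a made-up workload, and the alternative speeds are simply what the ratios imply, not measurements.

```python
CEREBRAS_TPS = 2_000                  # "two kilotokens per second"
FASTEST_ALT_TPS = CEREBRAS_TPS / 10   # upper bound implied by "more than 10x faster"
AVERAGE_ALT_TPS = CEREBRAS_TPS / 50   # implied by "roughly 50x the average"

RESPONSE_TOKENS = 10_000              # hypothetical size of one large generation


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Time to stream `tokens` at a steady rate, ignoring latency and queueing."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    print(seconds_to_generate(RESPONSE_TOKENS, CEREBRAS_TPS))     # 5.0
    print(seconds_to_generate(RESPONSE_TOKENS, FASTEST_ALT_TPS))  # 50.0
    print(seconds_to_generate(RESPONSE_TOKENS, AVERAGE_ALT_TPS))  # 250.0
```

So under these assumptions the same output takes about 5 seconds instead of somewhere between 50 seconds and a bit over 4 minutes.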

amelius|7 months ago

It generates code faster than I can inspect it.

In other words, it's needlessly fast.

ttoinou|7 months ago

I’d say super fast