Fancy hardware with a bespoke production process, smaller economies of scale, and utilization that is probably not great since they position on per-user speed; they are also purportedly under-invested in their compiler, which has a hard job compiling for such an architecture anyway. That ignores, for the moment, the cost of their bespoke software stack, which they can probably amortize away eventually.
According to OpenRouter, Cerebras charges $0.65/$0.85 per 1M input/output tokens for Llama 4 Scout. Google charges $0.25/$0.70 and lambda.ai charges $0.08/$0.30 for the same model.
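To make the spread concrete, here is a small sketch that computes per-request cost from those quoted per-1M-token rates. The request sizes (100k input / 10k output tokens) are illustrative assumptions, not from the comment:

```python
# Per-1M-token prices (input, output) in USD, as quoted above for Llama 4 Scout.
PRICES = {
    "Cerebras":  (0.65, 0.85),
    "Google":    (0.25, 0.70),
    "lambda.ai": (0.08, 0.30),
}

def request_cost(provider, input_tokens, output_tokens):
    """Dollar cost of one request at the quoted per-1M-token rates."""
    in_rate, out_rate = PRICES[provider]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a request with 100k input tokens and 10k output tokens.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 100_000, 10_000):.4f}")
```

At those sizes Cerebras comes out roughly 2x Google and nearly 7x lambda.ai for the same model, which is the gap the parent comment is attributing to hardware economics.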
recursivecaveat|8 months ago
totaa|7 months ago