HumanOstrich | 18 days ago
If instead the model is performing worse due to how much they had to shrink it just so it will fit on Cerebras hardware, then we might be in for a long wait for the next gen of ginormous chips.
postalcoder|18 days ago
I need to incorporate "risk of major failure" into bluey bench. Spark is a dangerous model. It doesn't strongly internalize the consequences of the commands that it runs, even on xhigh. As a result, I'm observing a high tendency to run destructive commands.
For instance, I asked it to assign random numbers to the filenames of the videos in my folder to run the benchmark. It accidentally deleted the files on most of the runs. The funniest part about it is that it comes back to you within a few seconds and says something like "Whoops, I have to keep it real, I just deleted the files in your folder."
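For what it's worth, the task itself never needs a delete. Here's a rough sketch of a rename-only version in Python, assuming the videos sit in one flat folder. The function names, the extension list, and the five-digit number range are all made up for illustration; this is not what Spark ran. It builds the rename plan first and defaults to a dry run, so nothing on disk changes until you opt in.

```python
import random
from pathlib import Path

# Assumed set of video extensions; adjust for your own folder.
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".webm"}


def plan_random_renames(folder, seed=None):
    """Return (old_path, new_path) pairs. Does not touch anything on disk."""
    rng = random.Random(seed)
    videos = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTS
    )
    # Sample without replacement so no two files share a number.
    numbers = rng.sample(range(10_000, 100_000), len(videos))
    # Prefix the number and keep the original name, so the rename is reversible.
    return [(p, p.with_name(f"{n}_{p.name}")) for p, n in zip(videos, numbers)]


def apply_renames(plan, dry_run=True):
    """Rename files according to the plan; only prints when dry_run is True."""
    for old, new in plan:
        if new.exists():
            # Never overwrite an existing file.
            raise FileExistsError(f"refusing to overwrite {new}")
        if dry_run:
            print(f"would rename {old.name} -> {new.name}")
        else:
            old.rename(new)
```

Run `apply_renames(plan_random_renames("videos"))` to preview, then pass `dry_run=False` once the output looks right (the folder name `videos` is a placeholder).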
jychang|18 days ago
They really should have just named it "gpt-5.3-codex-mini" (served by Cerebras). It would have made it clear what this model really is.
HumanOstrich|18 days ago
We don't know how they integrate with Cerebras hardware, but typically you'd pay a few million dollars to get the hardware in your own datacenter. So no, "served by Cerebras" is confusing and misleading.
Also, "mini" is confusing because it's not analogous to gpt-5.1-codex vs gpt-5.1-codex-mini. gpt-5.3-codex-spark is a unique, _experimental_ offering that doesn't fit the existing naming suffixes.
I don't understand what's wrong with "spark". It's friendly and evokes a sense of something novel, which is perfect.
If you want to know more about the model, read the first paragraph of the article. That information doesn't need to be hardcoded into the model name indefinitely. I don't see any "gpt-5.3-codex-nvidia" models.