cschneid | 3 months ago

Yes, this is the output speed. Code just flashes onto the page; it's pretty impressive.

They've claimed repeatedly in their Discord that they don't quantize models.

I think the speed does change how you interact with it. I had this new GLM model hooked up to opencode as the harness, on their $50/mo subscription plan. It was seriously fast at answering questions, although there are still big pauses in the workflow when the per-minute request cap is hit.

I got a meaningful refactor done, maybe a touch faster than I would have with Claude Code + Sonnet? But my own interaction with it felt like the slow part.

alyxya | 3 months ago

The human interaction is one of the main limits on speed: the more autonomous a model can be, the faster it is for me.