Rochus | 4 days ago

Cool, I was at ETH when Modula-2 was en vogue, and we also had lectures where we programmed transputers in Occam-2.

In contrast to my experience with e.g. Gemini 3 Pro, which regularly claimed in each iteration to have reached the full feature scope while the result turned out to be full of stubs, Devin at least doesn't pull my leg and delivers what was agreed. Unfortunately, debugging and fixing takes much more time than generating the initial version (by about a factor of five). So far I have never tried to run an LLM project over as long a time as you did; it must have cost a fortune.

dboreham|4 days ago

Cost me almost nothing in inference time (I have the monthly subscription), although if I had been paying myself at consulting rates it would have cost a few thousand for my time "LLM whispering" :) For clarity: I wasn't running the LLM for a month solid. I was on vacation in New Zealand -- I'd fire up the laptop in the AirBnB most nights and make Claude add a couple features, fix some bugs. Repeat rinse.

I find that it's uncannily like running a team of eager but not too experienced engineers: those humans would also show up claiming to have "finished". I'd say "Well, does it run such-and-such test OK?". They'd go away, come back a few days later... The LLM acts much the same. You have to keep it on a short leash, but when it gets cracking on a problem it's amazing to watch. E.g. I saw it write countless test programs on the fly to diagnose a parser hang bug. It would try this and that, binary chopping on the problematic source file. If I were doing that myself I'd need a few strong coffees before diving in.
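
For anyone who hasn't done the "binary chopping" trick by hand, here is a rough sketch of the idea in Python. It is not what Claude actually wrote, just an illustration under some assumptions: the hang is monotone (once a prefix of the file hangs the parser, every longer prefix does too), the empty input does not hang, and the parser reads its source on stdin. The names `smallest_failing_prefix` and `hangs` are made up for this example.

```python
import subprocess
from typing import Callable, Sequence


def smallest_failing_prefix(lines: Sequence[str],
                            fails: Callable[[Sequence[str]], bool]) -> int:
    """Return the smallest n for which fails(lines[:n]) is True.

    Assumes the failure is monotone (a failing prefix stays failing when
    extended) and that the empty prefix passes.
    """
    if not fails(lines):
        raise ValueError("full input does not reproduce the failure")
    lo, hi = 0, len(lines)  # invariant: lines[:lo] passes, lines[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(lines[:mid]):
            hi = mid  # failure already present in the first `mid` lines
        else:
            lo = mid  # need more than `mid` lines to trigger it
    return hi


def hangs(cmd: list[str], source: str, timeout_s: float = 2.0) -> bool:
    """Hypothetical predicate: treat a run longer than timeout_s as a hang."""
    try:
        subprocess.run(cmd, input=source, text=True,
                       capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return True
    return False
```

You would call it as something like `smallest_failing_prefix(lines, lambda p: hangs(["./parser"], "\n".join(p)))`, where `./parser` is a placeholder for the binary under test. Line `n - 1` (zero-based) is then the first line that triggers the hang. A chopped-off prefix will often be syntactically invalid, but that normally shows up as a quick error exit rather than a timeout, so the predicate still separates the two cases.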