top | item 44702937

On Friday I was converting a constrained solver from Python to another language and ran into difficulty substituting an optimizer that is a few easily written lines of SciPy but is barely supported in the other language. One AI tool found this out and fully re-implemented the solver using a custom linear algebra library it wrote from scratch. Another AI tool really struggled to get the syntax right for the common existing optimization libraries, and I felt like I was repeatedly putting queries (read: $) into the software equivalent of a slot machine, one that constantly apologized for not giving a testable answer while eating tens of dollars in direct costs as I waited for the "jackpot" of working code.

The feedback loop of "maybe the next time it'll be right" turned into a few hundred queries, which ended with finding that the LLM's attempts were a ~20-node cycle of things it had tried that didn't work. By then you're out a couple dollars and hours of engineering time.

moregrist|7 months ago

> One AI tool found this out and fully re-implemented the solver using a custom linear algebra library it wrote from scratch.

So slow, untested, and likely buggy, especially as the inputs become less well-conditioned?

If this was a jr dev writing code I’d ask why they didn’t use <insert language-relevant LAPACK equivalent>.

Neither llm outcome seems very ideal to me, tbh.

theshrike79|7 months ago

With mathematical things you can always write comprehensive and complete unit tests to check the AI's work.

TDD (and exhaustive unit tests in general) is a good idea with LLMs anyway. Just either tell it not to touch the tests, or in Claude's case you can use Hooks to _actually_ prevent it from editing any test file.

Then shove it at the problem and it'll iterate a solution until the tests pass. It's like the Excel formula solver, but for code :D
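
As a rough illustration of the kind of test suite described above, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: `solve` is a hypothetical stand-in for whatever linear solver the LLM produced (here, Gaussian elimination with partial pivoting), and the `check_*` functions are hypothetical tests that would be frozen before letting the model iterate. It also covers the conditioning concern raised earlier in the thread by including a zero-pivot case and a Hilbert matrix.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Hypothetical stand-in for the code under test.
    """
    n = len(A)
    # Work on an augmented copy so the caller's data is not modified.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def residual(A, x, b):
    """Largest absolute entry of A x - b."""
    return max(abs(sum(a * xi for a, xi in zip(row, x)) - rhs)
               for row, rhs in zip(A, b))


def hilbert(n):
    """n x n Hilbert matrix, a classic ill-conditioned test input."""
    return [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]


def check_known_solution():
    # 2x + y = 3 and x + 3y = 5 have the solution x = 0.8, y = 1.4.
    x = solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
    assert abs(x[0] - 0.8) < 1e-12 and abs(x[1] - 1.4) < 1e-12


def check_zero_leading_pivot():
    # Elimination without row swaps divides by zero on this input.
    x = solve([[0.0, 1.0], [1.0, 0.0]], [2.0, 3.0])
    assert abs(x[0] - 3.0) < 1e-12 and abs(x[1] - 2.0) < 1e-12


def check_ill_conditioned_residual():
    # Right-hand side is the row sums, so the exact solution is all ones.
    A = hilbert(6)
    b = [sum(row) for row in A]
    assert residual(A, solve(A, b), b) < 1e-10


def check_singular_is_rejected():
    try:
        solve([[1.0, 2.0], [2.0, 4.0]], [1.0, 2.0])
    except ValueError:
        return
    raise AssertionError("singular matrix was not rejected")


def run_checks():
    check_known_solution()
    check_zero_leading_pivot()
    check_ill_conditioned_residual()
    check_singular_is_rejected()


if __name__ == "__main__":
    run_checks()
    print("all checks passed")
```

Note that a small residual on the Hilbert case only shows the solver is self-consistent; on ill-conditioned inputs the solution itself can still be far less accurate, so tests like these narrow the risk rather than remove it.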

brookst|7 months ago

A very relatable experience. But not all that different from how humans work when in unfamiliar domains.

leptons|7 months ago

I'd rather work with a human. Even with our flaws, it's still better than constantly being lied to by a tin can. If a junior kept delivering broken results as much as the "AI" does, they wouldn't be on my team that long.

th0ma5|7 months ago

Except... Completely different