
GolDDranks | 1 month ago

Just a supplementary fact: I have one advantage over the AI. In cases where it's hard to provide that automatic feedback loop, I can run and test the code at my discretion, whereas the AI model can't.

Yet. Most of my criticism comes not after running the code, but after _reading_ the code. It wrote code. I read it. And I am not happy with it. No need even to run it; it's shit at a glance.

elevation|1 month ago

Over the weekend I generated a for-home-use-only PHP app with a popular CLI LLM product. The app met all my requirements, but the generated code was mixed. It correctly used a prepared query to avoid SQL injection. But then, instead of an obvious:

    "SELECT * FROM table WHERE id=1;" 
it gave me:

    $result = $db->query("SELECT * FROM table;");
    foreach ($result as $row)
        if ($row["id"] == 1)
            return $row;

With additional prompting I arrived at code I was comfortable deploying, but this kind of flaw cuts into the total time-savings.
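The contrast between the two versions can be sketched in a few lines. This is an illustration in Python's built-in sqlite3 module rather than PHP, with a hypothetical table and data; the point is only that the parameterized WHERE clause pushes the filtering into the database instead of scanning every row in application code:

```python
import sqlite3

# Hypothetical in-memory table; the "id" column mirrors the PHP snippet above.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

def find_by_id_scan(conn, wanted):
    # The anti-pattern: fetch every row, then filter in application code.
    for row in conn.execute("SELECT * FROM t"):
        if row[0] == wanted:
            return row
    return None

def find_by_id_where(conn, wanted):
    # The obvious version: a parameterized WHERE clause lets the database
    # (and its primary-key index) do the filtering, and stays injection-safe.
    return conn.execute("SELECT * FROM t WHERE id = ?", (wanted,)).fetchone()
```

Both return the same row here, but the scan version reads the whole table on every lookup, which is exactly the kind of flaw that only shows up on reading the code, not on running it against a small test database.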

ReverseCold|1 month ago

> I can run and test the code at my discretion, whereas the AI model can't.

It sounds like you know what the problem with your AI workflow is? Have you tried using an agent? (sorry somewhat snarky but… come on)

GolDDranks|1 month ago

Yeah, you're right, and the snark might be warranted. I should consider it the same as my stupid (but cute) robot vacuum cleaner that goes in random directions but gets the job done.

The thing that differentiates LLMs from my stupid-but-cute vacuum cleaner is that the AI model (at least OpenAI's) is cocksure and wrong, which is infinitely more infuriating than being a bit clueless and wrong.

__MatrixMan__|1 month ago

You might get better code out of it if you give the AI some more restrictive handcuffs. Spin up a tester instance and have it tell the developer instance to try again until it's happy with the quality.
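The tester/developer loop described above can be sketched as plain control flow. The `generate_code` and `review` functions here are hypothetical stand-ins for calls to two separate model instances (they are stubbed so the sketch runs); only the retry-until-the-tester-is-happy structure is the point:

```python
def generate_code(task, feedback=None):
    # Hypothetical "developer" instance: produce a candidate, revising it
    # when the tester has sent feedback. Stubbed for illustration.
    return task if feedback is None else feedback

def review(candidate):
    # Hypothetical "tester" instance: return (happy, feedback). Stubbed so
    # that it approves once its own suggestion has been incorporated.
    happy = candidate.endswith("fixed")
    return happy, candidate + " fixed"

def develop_until_happy(task, max_rounds=5):
    # The handcuffs: the developer only gets to stop when the tester signs
    # off, up to a bounded number of rounds.
    feedback = None
    for _ in range(max_rounds):
        candidate = generate_code(task, feedback)
        happy, feedback = review(candidate)
        if happy:
            return candidate
    return None  # give up rather than loop forever
```

The bound on rounds matters in practice: without it, two cocksure models can ping-pong revisions indefinitely.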