top | item 46913863

moregrist|23 days ago

Perhaps you're comfortable with a compiler that generates different code every time you run it on the same source with the same libraries (and versions) and the same OS.

I am not. To me that describes a debugging fiasco. I don't want "semantic closure," I want correctness and exact repeatability.


candiddevmike|23 days ago

I wish these folks would tell me how you would do a reproducible build, or reproducible anything really, with LLMs. Even after monkeying with temperature, different runs will still introduce subtle changes that would change the hash.
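
A minimal illustration of the hash problem (both snippets are hypothetical LLM outputs for the same prompt): the two versions are semantically identical, but any whitespace or naming difference changes the digest, so byte-level reproducibility checks fail.

```python
import hashlib

# Two outputs a model might emit for the same prompt: they compute the
# same thing, but differ by one space.
v1 = "def add(a, b):\n    return a + b\n"
v2 = "def add(a, b):\n    return a  + b\n"

h1 = hashlib.sha256(v1.encode()).hexdigest()
h2 = hashlib.sha256(v2.encode()).hexdigest()
# h1 != h2: any hash-based "reproducible build" check rejects this,
# even though both outputs behave identically.
```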

mvr123456|23 days ago

This reminds me of how you can create fair coins from biased ones and vice versa. You toss your coin repeatedly, and then get the singular "result" in some way by encoding/decoding the sequence. Different sequences might map to the same result, and so comparing results is not the same as comparing the sequences.
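
The biased-to-fair direction is the classic von Neumann trick; a quick sketch (function names are mine). Note that different toss sequences can yield the same result, just as the comment says.

```python
import random

def biased_flip(p_heads=0.8):
    """A biased coin: returns 1 (heads) with probability p_heads."""
    return 1 if random.random() < p_heads else 0

def fair_flip(flip):
    """Von Neumann extractor: toss twice, keep only mismatched pairs.
    (1, 0) -> heads, (0, 1) -> tails; (1, 1) and (0, 0) are discarded.
    Both mismatched orders are equally likely regardless of the bias,
    so the output is fair."""
    while True:
        a, b = flip(), flip()
        if a != b:
            return a
```

For instance, the toss sequences `1,1,1,0` and `1,0` both map to the same result (heads), so comparing results is coarser than comparing sequences.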

Meanwhile, you press the "shuffle" button, and code-gen creates different code. But this isn't necessarily the part that's supposed to be reproducible, and isn't how you actually go about comparing the output. Instead, maybe two different rounds of code-generation are "equal" if the test-suite passes for both. Not precisely the equivalence-class stuff the parent is talking about, but it's a simple way of thinking about it that might be helpful.
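
A sketch of that test-suite notion of equality (the generated snippets and the tiny test suite are hypothetical): the two outputs hash differently but land in the same "passes the tests" class.

```python
import hashlib

# Two hypothetical rounds of code generation for the same task.
gen_a = "def add(a, b):\n    return a + b\n"
gen_b = "def add(x, y):\n    total = x + y\n    return total\n"

def passes_tests(source):
    """'Equality' here means the generated code passes the test suite."""
    ns = {}
    exec(source, ns)
    add = ns["add"]
    return add(2, 3) == 5 and add(-1, 1) == 0

# Byte-level hashes differ, but both outputs are "equal" under the
# test-suite relation.
hashes_differ = (hashlib.sha256(gen_a.encode()).digest()
                 != hashlib.sha256(gen_b.encode()).digest())
```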

cjbgkagh|23 days ago

There is nothing intrinsic to LLMs that prevents reproducibility. You can run them deterministically without adding noise; it would just be a lot slower to enforce a deterministic order of operations, which takes an already bad idea and makes it worse.
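
A toy sketch of the point (toy logits, my function names, not any real inference stack): temperature sampling driven by an explicitly seeded RNG is fully reproducible. The nondeterminism in real deployments comes from unseeded sampling and non-deterministic floating-point reduction order, not from the model itself.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Temperature sampling over toy logits with a caller-supplied,
    seeded RNG. Given the same seed and a deterministic order of
    floating-point operations, every run picks the same tokens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    r = rng.random() * total             # draw from the seeded stream
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if r < acc:
            return i
    return len(exps) - 1
```

Two runs seeded identically produce the identical token sequence; the price in a real system is pinning the reduction order, which is the slowdown mentioned above.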

SecretDreams|23 days ago

Agree. I'm not sure what circle of software hell the OP is advocating for. We need consistent outputs from our most basic building blocks, not performance probability functions. A lot of software runs congruently across multiple nodes. What a nightmare it would be if you had to balance that on top of identical hardware.

pjmlp|22 days ago

That is exactly how JIT compilers work: you cannot guarantee 100% identical machine code generation across runs, unless you can reproduce the whole universe that led to the same heuristics and decision tree.
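
A toy sketch of why (all names are illustrative, not any real JIT): a tiering heuristic specializes ops the runtime profile says are hot. Because the profile is gathered at runtime, the same source can compile to different code on different runs.

```python
def jit_codegen(ops, profile, hot_threshold=1000):
    """Toy tiering decision in the style of a JIT: an op observed as
    hot at runtime gets a specialized inline form, a cold op gets a
    generic call. Identical source + different observed profiles
    -> different emitted code."""
    return [f"inline_{op}" if profile.get(op, 0) >= hot_threshold
            else f"call_{op}"
            for op in ops]
```

Same `ops` compiled under two different runtime profiles yields two different instruction streams, even though nothing about the source changed.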

raw_anon_1111|23 days ago

Once I create code with an LLM, the code is not going to magically change between runs because it was generated by an LLM, unless it did an "#import chaos_monkey".