Well, I mean just choosing better names; don't touch the actual code. And you can also add a basic human filtering step if you want. You cannot possibly say that "v12" is better than "header.size". I would argue that even hallucinated names are good: you should be able to think "but this position variable is not quite correctly updated, maybe this is not the position", which seems better than "this v12 variable is updated in some complicated way which I will ignore because it has no meaning".
If the variable actually holds the number of input files, then v12 is better than header.size. How can you be sure that adding some LLM noise will actually provide accurate names?
i think for obj-c specifically (can’t speak to other langs) i’ve had a great experience. it does make little mistakes, but an ai-oriented approach makes it faster/easier to find areas of interest to analyze or experiment with.
obj-c's use of objc_msgSend makes it more similar to understanding minified JS than decompiling static c, because it literally calls many methods by string name.
If you ask an LLM to do a statically verifiable task without writing a simple verifier for it, and it hallucinates, that mistake is on you because it's a very quick step to guarantee something like this succeeds.
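To make "a simple verifier" concrete, here is a rough Python sketch, not anyone's actual tooling. It assumes decompiler placeholders look like `v12` and uses a crude regex tokenizer instead of a real C parser, so placeholders inside string literals or comments would be treated as identifiers too. The function name `rename_is_safe` is made up for illustration. It only checks that the rename was mechanical (same tokens, consistent one-to-one mapping, no collision with existing names); it says nothing about whether the new names are accurate.

```python
import re

# Crude C-like tokenizer: identifiers, digit runs, or any single non-space char.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Assumption: decompiler placeholders look like v1, v12, v003, ...
PLACEHOLDER = re.compile(r"v\d+")
IDENT = re.compile(r"[A-Za-z_]\w*")


def rename_is_safe(original: str, renamed: str) -> bool:
    """Return True if `renamed` differs from `original` only by a
    consistent, one-to-one renaming of placeholder variables."""
    old_tokens = TOKEN.findall(original)
    new_tokens = TOKEN.findall(renamed)
    if len(old_tokens) != len(new_tokens):
        return False

    # Names already used in the original that a new name must not shadow.
    existing = {t for t in old_tokens if not PLACEHOLDER.fullmatch(t)}
    forward, backward = {}, {}

    for old, new in zip(old_tokens, new_tokens):
        if not PLACEHOLDER.fullmatch(old):
            # Everything that is not a placeholder must be untouched.
            if old != new:
                return False
            continue
        # The replacement must be an identifier that does not collide.
        if not IDENT.fullmatch(new) or new in existing:
            return False
        # Same placeholder always maps to the same name, and no two
        # placeholders are merged into one name.
        if forward.setdefault(old, new) != new:
            return False
        if backward.setdefault(new, old) != old:
            return False
    return True
```

For example, `rename_is_safe("int v1 = v2 + 1;", "int count = step + 1;")` passes, while a rename that also changes `1` to `2`, or maps both `v1` and `v2` to the same name, is rejected. Compiling the result is still a separate check.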
I mean, step 0 is verifying that the code with changed names actually compiles. But step 1, which is way more difficult, is ensuring that replacing v01 with out_file_idx or whatever actually gives a more accurate description of the purpose of v01. Otherwise, what's the point of generating names if they have a 10% chance of misleading more than clarifying?
The code is already working with the v01, v02 names though. The use of LLMs here is intended to add information so that humans can more easily understand what the code does. That might be worthwhile, but I think this AI upscaling of Obama illustrates pretty well the potential risks of trying to fill in information gaps without a proper understanding of the data: https://x.com/Chicken3gg/status/1274314622447820801
empiricus|12 days ago
streetfighter64|12 days ago
jitl|12 days ago
orbital-decay|12 days ago
sigseg1v|12 days ago
streetfighter64|12 days ago
Cthulhu_|12 days ago
streetfighter64|12 days ago