Comparing fine-tuning to editing binaries by hand is not fair. If I could show the decompiler some output I liked and it edited the binary for me to make the output match, then the comparison would be closer.
> If I could show the decompiler some output I liked and it edited the binary for me to make the output match, then the comparison would be closer.
That's fundamentally the same thing though - you run an optimization algorithm on a binary blob. I don't see why this couldn't work. Sure, a neural net is designed to be differentiable, while ELF and PE executables aren't, but then backprop isn't the be-all, end-all of optimization algorithms.
Off the top of my head, you could reframe the task as a special kind of genetic programming problem, one that starts with a large existing program instead of starting from scratch, and that works on assembly instead of an abstract syntax tree. Hell, you could first decompile the executable and then have the genetic programming solver run on the decompiled code.
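To make that concrete, here is a minimal sketch of the idea, under heavy assumptions: the "binary" is a short list of instructions for a made-up accumulator machine (the `add`/`sub`/`mul` instruction set is hypothetical, not any real ISA), and the search is a greedy mutation-only loop rather than full genetic programming with a population and crossover. It starts from an existing working program and nudges it toward the outputs you showed it.

```python
import random

# Hypothetical toy instruction set: each instruction is (opcode, operand).
OPS = {
    "add": lambda acc, k: acc + k,
    "sub": lambda acc, k: acc - k,
    "mul": lambda acc, k: acc * k,
}

def run(program, x):
    """Interpret the program on input x, threading a single accumulator."""
    acc = x
    for op, k in program:
        acc = OPS[op](acc, k)
    return acc

def error(program, examples):
    """Total absolute difference between actual and desired outputs."""
    return sum(abs(run(program, x) - want) for x, want in examples)

def mutate(program, rng):
    """Return a copy with one instruction changed: new opcode, or operand +/- 1."""
    child = list(program)
    i = rng.randrange(len(child))
    op, k = child[i]
    if rng.random() < 0.5:
        op = rng.choice(sorted(OPS))
    else:
        k += rng.choice((-1, 1))
    child[i] = (op, k)
    return child

def evolve(program, examples, generations=2000, seed=0):
    """Mutation-only search that starts from an existing program, not from scratch."""
    rng = random.Random(seed)
    best, best_err = list(program), error(program, examples)
    for _ in range(generations):
        if best_err == 0:
            break
        child = mutate(best, rng)
        child_err = error(child, examples)
        if child_err <= best_err:  # keep the child only if it is no worse
            best, best_err = child, child_err
    return best, best_err

original = [("mul", 2), ("add", 3)]                # computes 2*x + 3
examples = [(x, 2 * x + 5) for x in range(-3, 4)]  # the outputs we would like instead
patched, remaining = evolve(original, examples)
print(patched, remaining)
```

A greedy loop like this never gets worse than the program it started from, but it can stall in a local optimum; a real solver would presumably keep a population and recombine candidates.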
I'd be really surprised if no one has tried that before, or if such functionality isn't already available in some RE tools (or as a plugin for one). My own hands-on experience with reverse engineering is limited to a few attempts at adding extra UI and functionality to StarCraft by writing some assembly, turning it into object code, and injecting it straight into the running game process[0] - but that was me doing exactly what you described, just by hand. I imagine doing such things is common enough practice in RE that someone has already automated finding the specific parts of the binary that produce the outputs you want to modify.
--
[0] - I sometimes miss the times before Data Execution Prevention became a thing.
The question is not whether it is ideal for some ML tasks; the question is whether you can do the things you could typically do with open source software, including looking at the source and building it, or modifying the source and building it. If you don't have the original training data, or the mechanism for getting the training data, the compiled result is not reproducible the way normal code would be, and you cannot make a version that says, for example: "I want just the same, but without it ever learning from CCP prop."
It is a fair comparison. Normal programming takes inputs and a function and produces outputs. Deep learning takes inputs and outputs and derives a function. Of course the decompilers for traditional programs do not work on inputs and outputs; it is a different paradigm!
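A tiny illustration of the two directions, with ordinary least squares on a straight line standing in for deep learning (an assumption for brevity; the function `f` and the helper `fit_line` are made up for this sketch):

```python
def f(x):
    """Normal programming: the function is written by hand."""
    return 3 * x + 1

inputs = [0, 1, 2, 3, 4]
outputs = [f(x) for x in inputs]  # inputs + function -> outputs

def fit_line(xs, ys):
    """Learning direction: recover slope and intercept from input/output pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(inputs, outputs)  # inputs + outputs -> function
print(slope, intercept)
```

The fitted parameters play the role of the weights: they reproduce the behaviour, but nothing in them tells you which examples they were derived from.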
TeMPOraL|1 year ago
sitkack|1 year ago
_ea1k|1 year ago
Seriously, a set of weights that already works really well is basically the ideal basis for a _lot_ of ML tasks.
zelphirkalt|1 year ago
dartos|1 year ago
Especially with dynamically linked binaries like many games.
carom|1 year ago