Back in the days of FIDO, I accepted a challenge to crack a program (find a correct password) that more or less consisted of a loop simulating a three-address MOV instruction.
The loop's jump address sometimes changed, either for effectful operations like printing or for optimizations such as performing addition.
It took me about four hours to find the correct password. During that time I wrote 1) an executor that used the i386 debug registers to watch for the current MOV addresses, 2) a tracer that produced a trace and 3) a compactor that identified common instruction sequences and presented each of them as a macrocommand. It turned out the original source code had used macros in the opposite direction, expanding them into MOV sequences. The final challenge was to write a brute-force password finder, which is not that hard at all (for a 32-bit checksum).
All in x86 assembler. I guess it was about 1995-96, somewhere around there.
Now I'd use the same technique, but at a higher level. Instead of peephole compacting I'd use graph analysis, but that's about it. You can get pretty much everything from the program trace; I think this way you can get even more information than from a disassembly.
So in my opinion, it is one hell of a cool experiment. But try not to use it as a real obfuscation device.
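To make the "compactor" step above a bit more concrete, here is a minimal sketch in Python rather than x86 assembler. It is not the original tool: the trace format (a list of instruction strings) and the names `most_common_ngram`, `compact` and `MACRO_0` are assumptions made for illustration. The idea is only to find the most frequent fixed-length window in a trace and fold each non-overlapping occurrence into one macro name.

```python
from collections import Counter


def most_common_ngram(trace, n):
    """Return (window, count) for the most frequent length-n window.

    Overlapping occurrences are counted, so the count is an upper bound
    on how many replacements compact() will actually make.
    """
    windows = Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))
    if not windows:
        return None, 0
    return windows.most_common(1)[0]


def compact(trace, n, macro_name):
    """Fold non-overlapping occurrences of the most frequent n-gram into macro_name."""
    pattern, count = most_common_ngram(trace, n)
    if pattern is None or count < 2:
        return list(trace), None  # nothing repeats, so leave the trace alone
    out, i = [], 0
    while i < len(trace):
        if tuple(trace[i:i + n]) == pattern:
            out.append(macro_name)
            i += n
        else:
            out.append(trace[i])
            i += 1
    return out, pattern


# Hypothetical trace: the same three MOVs appear twice around one other MOV.
sample_trace = [
    "mov a, b", "mov c, a", "mov d, c",
    "mov x, y",
    "mov a, b", "mov c, a", "mov d, c",
]
compacted, pattern = compact(sample_trace, 3, "MACRO_0")
```

Running `compact` repeatedly with different window lengths and fresh macro names would gradually shorten a trace, which is roughly the peephole-style compaction described; a graph-based analysis would be a different technique.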
I also remember running into a "forest of MOVs" obfuscation technique while attempting to crack a protection back in the very late 80s, and I remember it so well because it caused me to change my analysis strategy completely. The fun part was that the "interesting" MOVs were hidden amongst other instructions that seemed to perform useful computation, although their results were just thrown away and the MOVs were doing all the work. At the time I was fond of printing out code and inspecting/annotating it manually, so I think it took several days and lots of careful documenting of the algorithm before I realised that it was all useless. Upon tracing back the source of the actual value used in the decision and crossing out the irrelevant instructions, imagine my surprise when almost all that was left were MOVs...!
That one taught me it was far better to start by working backward from the result, although self-modifying code tends to be more difficult that way.
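The "work backward from the result and cross out the irrelevant instructions" approach can be sketched as a backward slice over a straight-line trace. This is only an illustration under stated assumptions: each trace entry is a hypothetical `(destination, sources)` pair, the register names are made up, and self-modifying code (which the comment notes is harder) is not handled at all.

```python
def backward_slice(trace, target):
    """Return indices of trace entries that contribute to `target`.

    `trace` is a list of (destination, sources) pairs in execution order.
    Walking from the end, an entry is kept only if it writes a location
    that is still needed; its sources then become needed in turn.
    """
    needed = {target}
    kept = []
    for index in range(len(trace) - 1, -1, -1):
        destination, sources = trace[index]
        if destination in needed:
            needed.discard(destination)  # this write defines the value
            needed.update(sources)       # so its inputs matter instead
            kept.append(index)
    kept.reverse()
    return kept


# Hypothetical trace: entries 1 and 2 look busy but never reach "flag".
sample_trace = [
    ("eax", ("input",)),      # mov eax, input
    ("ebx", ("eax",)),        # decoy computation
    ("ecx", ("ebx", "ebx")),  # decoy computation
    ("edx", ("eax",)),        # mov edx, eax
    ("flag", ("edx",)),       # mov flag, edx
]
relevant = backward_slice(sample_trace, "flag")
```

Everything whose index is not in `relevant` is the part that would get crossed out on the printout.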
Strongest possible +1 for the slides if you are at all interested in low-level alchemy. Also see slide 109 for the beginning of a shadow argument that this might actually have some real-world utility, in that the long list of MOVs is virtually immune to comprehension by existing reverse engineering tools and practices.
That's a TTA (https://en.wikipedia.org/wiki/Transport_triggered_architectu... ), where effectively the ALU and other computation units become memory-mapped devices. It's the logical extension of how many microcontrollers that lack a multiply instruction in their instruction sets, e.g. 8051, instead have a multiplier unit that's accessed by reading/writing special memory addresses.
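A toy model of that memory-mapped idea, as a sketch only: the class name, the three addresses and the "writing operand B triggers the multiply" convention are all assumptions for illustration and do not describe any particular chip.

```python
class MemoryMappedMultiplier:
    """Toy bus where a multiplier unit lives at three made-up addresses."""

    OPERAND_A = 0xF0  # hypothetical address: first operand
    OPERAND_B = 0xF1  # hypothetical address: writing here triggers the multiply
    RESULT = 0xF2     # hypothetical address: product is read from here

    def __init__(self):
        self.mem = {}

    def store(self, address, value):
        """Model of `mov [address], value`."""
        self.mem[address] = value
        if address == self.OPERAND_B:
            # The side effect of the move is what performs the computation.
            self.mem[self.RESULT] = self.mem.get(self.OPERAND_A, 0) * value

    def load(self, address):
        """Model of `mov reg, [address]`."""
        return self.mem.get(address, 0)


bus = MemoryMappedMultiplier()
bus.store(MemoryMappedMultiplier.OPERAND_A, 6)
bus.store(MemoryMappedMultiplier.OPERAND_B, 7)
product = bus.load(MemoryMappedMultiplier.RESULT)  # 42
```

The program itself only ever moves data; the arithmetic happens inside the unit behind the address.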
That's somewhat different from the move-based code discussed here where the MOVs are actually performing the computation.
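For contrast, here is a sketch of how plain MOVs can compute something with no special unit behind the address. It models one commonly cited trick for MOV-only code: an equality test built from two stores and a load. The dictionary stands in for memory and the function name `mov_equal` is an assumption for illustration.

```python
def mov_equal(a, b):
    """Return 1 if a == b, else 0, using only store/load steps.

    `a` and `b` are treated as addresses into a scratch memory.
    """
    mem = {}
    mem[a] = 0     # mov [a], 0
    mem[b] = 1     # mov [b], 1   (overwrites the 0 only when a == b)
    return mem[a]  # mov result, [a]


same = mov_equal(5, 5)       # 1
different = mov_equal(5, 6)  # 0
```

Here the comparison falls out of address aliasing alone, which is the sense in which the MOVs themselves do the computing.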
[+] [-] thesz|10 years ago|reply
[+] [-] userbinator|10 years ago|reply
[+] [-] cautious_int|10 years ago|reply
[+] [-] patio11|10 years ago|reply
[+] [-] pkaye|10 years ago|reply
I've always felt these were single-instruction-set machines only by a trick, because you are using some of the addressing bits to encode an opcode.
[+] [-] userbinator|10 years ago|reply
[+] [-] foobar2020|10 years ago|reply
[+] [-] ishtu|10 years ago|reply
[+] [-] ericfrederich|10 years ago|reply
[+] [-] kazinator|10 years ago|reply
https://news.ycombinator.com/item?id=9751312
[+] [-] agumonkey|10 years ago|reply
[+] [-] ape4|10 years ago|reply
[+] [-] jschwartzi|10 years ago|reply