Movfuscator: A single-instruction C compiler

[+] thesz|10 years ago|reply

Back in the FIDO days, I accepted a challenge to crack a program (find the correct password) that more or less consisted of a loop simulating a three-address MOV instruction.

The loop jump address sometimes changed for some effectful operations like printing or for optimizations like executing addition.
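A minimal sketch of such a loop, for illustration only. The instruction encoding here (`dst`, `base`, `idx`, meaning `mem[dst] = mem[base + mem[idx]]`) is an assumption, since the challenge program's real format is not given, and the changing jump address for printing or addition is not modeled.

```python
# Hypothetical three-address MOV machine. Each instruction is a tuple
# (dst, base, idx) meaning: mem[dst] = mem[base + mem[idx]].
# This encoding is assumed for illustration; it is not the original format.

def run(program, mem):
    """Execute every MOV in order, mutating and returning mem."""
    for dst, base, idx in program:
        mem[dst] = mem[base + mem[idx]]
    return mem

# "x + 1" as a single indexed MOV through a successor table.
mem = [0] * 32
for i in range(16):
    mem[10 + i] = i + 1          # successor table stored at address 10
mem[0] = 5                       # x lives at address 0
program = [(1, 10, 0)]           # mem[1] = mem[10 + mem[0]]
run(program, mem)
result = mem[1]                  # 6
```

Even this toy version shows why a disassembly is unhelpful: the code is one loop, and all the logic lives in the data it walks over.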

It took me about four hours to find the correct password. In the course of those hours I wrote 1) an executor that used i386 debug registers to look for current MOV addresses, 2) a tracer that produced a trace and 3) a compactor that identified common instruction sequences and presented them as macro commands. It turned out the original source code had used macros in the opposite direction. The final challenge was to write a brute-force password finder, which is not that hard at all (for a 32-bit checksum).
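The compactor step might look roughly like the sketch below. It is not the original x86 tool: the function name `compact`, the fixed window size `n`, and the string form of the trace entries are all assumptions made for illustration.

```python
from collections import Counter

def compact(trace, n=3, min_count=2):
    """Replace the most frequent length-n run in trace with one macro token.

    Counting uses overlapping windows, so the count is an upper bound on
    how many replacements are actually made.
    """
    counts = Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))
    if not counts:
        return list(trace), None
    pattern, count = counts.most_common(1)[0]
    if count < min_count:
        return list(trace), None
    out, i = [], 0
    while i < len(trace):
        if tuple(trace[i:i + n]) == pattern:
            out.append(("MACRO", pattern))   # collapse the run to one token
            i += n
        else:
            out.append(trace[i])
            i += 1
    return out, pattern

# Hypothetical trace: the run a, b, c appears twice.
trace = ["a", "b", "c", "x", "a", "b", "c", "y"]
compacted, pattern = compact(trace)
```

Applying the function repeatedly, with macro tokens treated as ordinary trace entries, would build up larger macros from smaller ones.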

All in x86 assembler. I guess it was about 1995-96, somewhere there.

Now I'd use the same technique, but at a higher level. Instead of peephole compacting I'd use graph analysis, but that's about it. You can get pretty much everything from the program trace; I think you can get even more information this way than from disassembly.

So in my opinion, it is one hell of a cool experiment. But try not to use it as a real obfuscation device.

[+] userbinator|10 years ago|reply

I also remember seeing a "forest of MOVs" obfuscation technique while attempting to crack a protection back in the very late 80s, and I remember it so well because it caused me to change my analysis strategy completely. The fun part was that the "interesting" MOVs were hidden amongst other instructions that seemed to perform useful computation, although the results of that were just thrown away and the MOVs were doing all the work. At the time I was fond of printing out code and inspecting/annotating it manually, so I think it took several days and lots of careful documenting of the algorithm before I realised that it was all useless; and upon tracing back the source of the actual value used in the decision and crossing out the irrelevant instructions, imagine my surprise when almost all that was left were MOVs...!

That one taught me it was far better to start by working backward from the result, although self-modifying code tends to be more difficult that way.
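Working backward from the result amounts to a backward slice over a trace. The sketch below assumes a straight-line trace where each entry is a hypothetical `(dst, srcs)` pair. It ignores self-modifying code and memory aliasing, which is exactly where this approach gets harder.

```python
def backward_slice(trace, target):
    """Return indices of trace entries the final value of target depends on."""
    needed = {target}
    kept = []
    for i in range(len(trace) - 1, -1, -1):
        dst, srcs = trace[i]
        if dst in needed:
            needed.discard(dst)      # this entry defines the value we wanted
            needed.update(srcs)      # now we need whatever it was built from
            kept.append(i)
    return kept[::-1]

# Hypothetical trace: decoy arithmetic on "junk" is interleaved with MOVs.
trace = [
    ("a", ["input"]),          # 0: mov a, input
    ("junk", ["a"]),           # 1: decoy computation
    ("b", ["a"]),              # 2: mov b, a
    ("junk", ["junk", "b"]),   # 3: decoy computation
    ("flag", ["b"]),           # 4: mov flag, b
]
relevant = backward_slice(trace, "flag")   # [0, 2, 4]
```

Everything not in `relevant` can be crossed out, which mirrors the manual process described above.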

[+] cautious_int|10 years ago|reply

I suggest taking a look at the slides, which show how much trickery is involved: https://github.com/xoreaxeaxeax/movfuscator/raw/master/slide...

[+] patio11|10 years ago|reply

Strongest possible +1 for the slides if you are at all interested in low-level alchemy. Also see slide 109 for the beginning of a shadow argument that this might actually have some real-world utility, in that the long list of MOVs is virtually immune to comprehension by existing reverse engineering tools and practices.

[+] pkaye|10 years ago|reply

The Maxim Integrated MAXQ is one commercial processor that uses a MOV-based instruction set. http://www.maximintegrated.com/en/app-notes/index.mvp/id/322...

I've always felt the single-instruction claim for these is more of a trick, because you are using some of the addressing bits to encode an opcode.

[+] userbinator|10 years ago|reply

That's a TTA (https://en.wikipedia.org/wiki/Transport_triggered_architectu... ), where effectively the ALU and other computation units become memory-mapped devices. It's the logical extension of how a lot of microcontrollers that don't have a multiply instruction in their instruction sets, e.g. the MSP430, instead have a multiplier unit that's accessed by reading/writing special memory addresses.
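A toy model of the memory-mapped idea, as a sketch only. The addresses `0xF0` to `0xF2` and the `Bus` class are invented for illustration and do not correspond to any real part.

```python
class Bus:
    """Toy memory bus with a memory-mapped multiplier (addresses are made up)."""
    MUL_A, MUL_B, MUL_OUT = 0xF0, 0xF1, 0xF2

    def __init__(self):
        self.ram = {}

    def read(self, addr):
        if addr == self.MUL_OUT:
            # Reading the output port triggers the multiplication.
            return self.ram.get(self.MUL_A, 0) * self.ram.get(self.MUL_B, 0)
        return self.ram.get(addr, 0)

    def write(self, addr, value):
        self.ram[addr] = value

def mov(bus, dst, src):
    """The only 'instruction': copy one bus location to another."""
    bus.write(dst, bus.read(src))

bus = Bus()
bus.write(0, 6)
bus.write(1, 7)
mov(bus, Bus.MUL_A, 0)       # feed operand A
mov(bus, Bus.MUL_B, 1)       # feed operand B
mov(bus, 2, Bus.MUL_OUT)     # fetch the product
product = bus.read(2)        # 42
```

Here the MOV only transports data, and the work happens in the device behind the address.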

That's somewhat different from the move-based code discussed here where the MOVs are actually performing the computation.
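By contrast, here is a sketch of a comparison done purely with stores and loads, in the style of the trick described in the "mov is Turing-complete" paper. A Python dict stands in for memory, and the function name `mov_equal` is hypothetical.

```python
def mov_equal(x, y):
    """Return 1 if x == y, else 0, using only store/load steps."""
    mem = {}
    mem[x] = 0       # mov [x], 0
    mem[y] = 1       # mov [y], 1  (overwrites the 0 only when x == y)
    return mem[x]    # mov r, [x]

same = mov_equal(3, 3)       # 1
different = mov_equal(3, 4)  # 0
```

No arithmetic unit is involved here. The result comes entirely from whether two addresses alias.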

[+] foobar2020|10 years ago|reply

The x86 is actually Turing-complete without even executing a single instruction. Page faulting is enough: https://github.com/jbangert/trapcc

[+] ishtu|10 years ago|reply

The author is the same person who published the epic x86 vulnerability https://github.com/xoreaxeaxeax/sinkhole

[+] ericfrederich|10 years ago|reply

I thought those slides looked familiar. Thought it might have been a template that the conference provided. Should have looked at the author ;-)

[+] kazinator|10 years ago|reply

Exact dupe:

https://news.ycombinator.com/item?id=9751312

[+] agumonkey|10 years ago|reply

Previously: https://news.ycombinator.com/item?id=9751312 https://news.ycombinator.com/item?id=6309631

[+] ape4|10 years ago|reply

Write your program using the nearly Turing-complete C preprocessor and compile into mov.

[+] jschwartzi|10 years ago|reply

Is ARMv7 MOV Turing-complete?

[+] adamzubi700|10 years ago|reply

[deleted]
