top | item 39665258

tety | 2 years ago

Can you share some of the tools and methods you used to reverse such a large C++ codebase into readable code?

Hamuko|2 years ago

>"GTA 3 and Vice City were originally written in C++," aap explains. "The compiled executables that are shipped are in machine code. So the general task is to go from machine code back to C++.

>"Machine code can be (more or less) mapped 1:1 to a human readable form called assembly language, but it's still very tedious to read.

>"To go back to C++ is by no means a simple 1:1 mapping, but over the last 10 or so years decompilers have appeared that help with this process.

>"So what we typically do is work with the output of the decompiler and massage it back into readable C++. This is sometimes quite easy and sometimes hard, but in any case it's a lot of code and you're bound to make mistakes."

>Thankfully, the code for GTA 3 on PS2 and Android includes debug symbols. Debug symbols contain all the extra information needed to debug a game during the development process, but are often stripped out for release executables to avoid bloat. For whatever reason, Rockstar left these symbols in, giving the reverse-engineering team a huge leg-up.

>"We were very lucky we had symbols for the games," aap says. "PS2 [GTA] 3 and all the Android releases have names for the global stuff (functions and global variables). This was a huge help and I don't think we'd be anywhere near reversed GTA without them."

https://www.eurogamer.net/how-a-small-group-of-gta-fanatics-...

euazOn|2 years ago

I'm curious: since this seems to be mostly about walking back the one-way street from C++ to assembly, what's stopping us from training an ML model to help with decompilation by figuring out patterns in the compilation process? (There has to be a reason, otherwise someone would have done it already.)

lstodd|2 years ago

Well, you can look into how DFHack for Dwarf Fortress was redone. It involved a C++ decompiler written in clisp.

Since then Ghidra was released, so the process is somewhat simpler, if not as much *fun*.