>"GTA 3 and Vice City were originally written in C++," aap explains. "The compiled executables that are shipped are in machine code. So the general task is to go from machine code back to C++.
>"Machine code can be (more or less) mapped 1:1 to a human readable form called assembly language, but it's still very tedious to read.
>"To go back to C++ is by no means a simple 1:1 mapping, but over the last 10 or so years decompilers have appeared that help with this process.
>"So what we typically do is work with the output of the decompiler and massage it back into readable C++. This is sometimes quite easy and sometimes hard, but in any case it's a lot of code and you're bound to make mistakes."
>Thankfully, the code for GTA 3 on PS2 and Android includes debug symbols. Debug symbols contain all the extra information needed to debug a game during the development process, but are often stripped out for release executables to avoid bloat. For whatever reason, Rockstar left these symbols in, giving the reverse-engineering team a huge leg-up.
>"We were very lucky we had symbols for the games," aap says. "PS2 [GTA] 3 and all the Android releases have names for the global stuff (functions and global variables). This was a huge help and I don't think we'd be anywhere near reversed GTA without them."
I'm curious: since this seems to be mostly about walking back the one-way street from C++ to assembly, what's stopping us from training an ML model to help with decompilation by learning patterns in the compilation process? (There has to be a reason, otherwise someone would have done it already.)
The history[1] section in the README file contains a description of how they did it. In summary, it seems they debugged GTA3 and wrote their stub implementations until the game was all reimplemented, presumably with only the assets from Rockstar’s GTA3.
Hamuko|2 years ago
https://www.eurogamer.net/how-a-small-group-of-gta-fanatics-...
euazOn|2 years ago
supriyo-biswas|2 years ago
[1] https://github.com/halpz/re3?tab=readme-ov-file#history
lstodd|2 years ago
Since then Ghidra was released, so the process is somewhat simpler, if not as much *fun*.