Sounds like an interesting direction, but I don't understand why it should be coupled to a specific tool (pwndbg). Why not implement a Binary Ninja plugin that dumps all user-defined names (function names, stack variables), together with the original (stripped) binary, into a new ELF/.exe file with a symbol table and presumably a DWARF section?
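To make the symbol-table half of that idea concrete, here is a minimal Python sketch of serializing recovered function names into ELF64 `.symtab` and `.strtab` contents. It is only an illustration under stated assumptions: the `build_symtab` helper, the input dictionary, and the `text_shndx` section index are hypothetical, it does not use any Binary Ninja API, and it does not write the section headers or the DWARF data that a complete exporter would also need.

```python
import struct

# Elf64_Sym layout (little-endian): st_name (u32), st_info (u8), st_other (u8),
# st_shndx (u16), st_value (u64), st_size (u64), 24 bytes in total.
ELF64_SYM = struct.Struct("<IBBHQQ")

STB_GLOBAL = 1  # symbol binding: global
STT_FUNC = 2    # symbol type: function


def build_symtab(functions, text_shndx=1):
    """Return (symtab_bytes, strtab_bytes) for a {name: (address, size)} dict.

    text_shndx is an assumed section index for .text; a real exporter would
    take it from the section header table of the output file.
    """
    strtab = bytearray(b"\x00")  # offset 0 is the empty string
    symtab = bytearray(ELF64_SYM.pack(0, 0, 0, 0, 0, 0))  # mandatory null symbol
    # Emit symbols in address order so the output is deterministic.
    for name, (addr, size) in sorted(functions.items(), key=lambda kv: kv[1][0]):
        name_off = len(strtab)
        strtab += name.encode("utf-8") + b"\x00"
        info = (STB_GLOBAL << 4) | STT_FUNC
        symtab += ELF64_SYM.pack(name_off, info, 0, text_shndx, addr, size)
    return bytes(symtab), bytes(strtab)


if __name__ == "__main__":
    # Hypothetical names recovered during reverse engineering.
    recovered = {"main": (0x401000, 32), "helper": (0x401020, 16)}
    symtab, strtab = build_symtab(recovered)
    print(len(symtab), strtab)
```

Stack variables are a different matter: they have no place in the symbol table and would have to be described in debugging data such as DWARF.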

boricj|1 year ago

I've developed a Ghidra extension that exports object files. I've considered generating debugging symbols in order to improve the debugging experience when reusing these object files in new programs, but I keep postponing that feature for various reasons.

Executable formats have at least one and often multiple debugging data formats which are very different from each other: ELF has STABS and DWARF version 1 to 5, MSVC has at least COFF symbols and PDB (which isn't documented)... Even discarding the old or obsolete stuff, there's no universal solution here. gdb+pwndbg seems to side-step this issue by integrating the debugger with Binary Ninja.

Projecting reverse-engineered information into a debugging data format would also be a technical challenge once you go past global variables and type definitions. Debuggers already have a terrible user experience when stepping through functions in an optimized executable; I doubt that reverse-engineered debugging data would be any better.

Toolchains also don't do a lot of validation or diagnostics on their inputs and I can tell from experience that writing correct object files from scratch is already quite tricky. I expect that serializing correct and meaningful debugging data would be much harder than that.

Doing this at the native executable level has the obvious advantage of working out of the box with standard tooling, but it would be a lot of work. I've already taken 2 1/2 years to make an object file exporter that's good enough for my needs, and I still balk at generating DWARF debugging data every time I consider it. I'm resigned to a terrible debugging experience and so far I've managed to muddle through it.

jcranmer|1 year ago

> Debuggers already have a terrible user experience when stepping through functions in an optimized executable; I doubt that reverse-engineered debugging data would be any better.

Actually, I suspect it could be a world of difference. The main failures of optimized debugging are that code motion makes line number tables more of a suggestion, the source code values may disappear in favor of other related values (e.g., SROA or changing A[i] to p++), live ranges of variables may shrink, and debugging information may only support variables in stack slots or other specific locations. If you generate debugging information based on decompiled code, you can control the output code to make the first two problems more or less go away, so you only really have to worry about narrow live ranges (can't do much about that) or debuggers not supporting the features you need (which, given DWARF, is a very real possibility).

SoothingSorbet|1 year ago

> PDB (which isn't documented)...

What about https://github.com/Microsoft/microsoft-pdb ? Not technically documentation, but the closest thing to it.

5-|1 year ago

that's already available in binaryninja out of the box, via the (seemingly undocumented) 'plugins' -> 'export as dwarf' menu item.

unvariant|1 year ago

Not really: export as DWARF only gives types and symbols. It does not generate DWARF pc -> source line information, which this plugin provides via a custom context pane. Manually running the export every time a pc step occurs would also be quite painful.
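For readers unfamiliar with what "pc -> source line information" amounts to, the following Python sketch shows the kind of lookup a debugger performs on each step. It is a simplified illustration, not the plugin's code or the DWARF line number program format: the `LineTable` class and the sample addresses are hypothetical, and real line tables also carry file names, column numbers, and end-of-sequence markers.

```python
import bisect


class LineTable:
    """Maps a program counter to the line of (decompiled) source covering it."""

    def __init__(self, rows):
        # rows: iterable of (start_address, line_number); each row covers
        # addresses from its start up to the start of the next row.
        self._rows = sorted(rows)
        self._addrs = [addr for addr, _ in self._rows]

    def line_for(self, pc):
        # Find the last row whose start address is <= pc.
        i = bisect.bisect_right(self._addrs, pc) - 1
        if i < 0:
            return None  # pc lies before the first known row
        return self._rows[i][1]


if __name__ == "__main__":
    # Hypothetical mapping from instruction addresses to decompiler output lines.
    table = LineTable([(0x1000, 1), (0x1004, 2), (0x1010, 5)])
    for pc in (0x0FFF, 0x1003, 0x1004, 0x1010):
        print(hex(pc), table.line_for(pc))
```

A table like this has to be regenerated whenever the decompiler output changes, which is why a one-off manual export fits poorly with interactive stepping.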