top | item 32079720

(no title)

ishitatsuyuki | 3 years ago

Rellic [1] implements an algorithm that generates goto-free control flows (citation in README), which would be a significant improvement against what Ghidra/IDA generates currently.

Unfortunately it looks like the maintenance state of the pieces around Rellic isn't very good, and it's quite rocket science to get it building. It doesn't have as much UI/GUI as Ghidra either so it's a bit far from accessible right now.

[1]: https://github.com/lifting-bits/rellic

discuss

order

FrozenVoid|3 years ago

What happens with code that uses lots of gotos(incl. computed gotos)?

jcranmer|3 years ago

From reading the paper, it basically does jump unthreading. Basically, if you imagine code like this:

  bool found = false;
  for (...) {
    if (...) {
      found = true;
      break;
    }
  }
  if (found) {
    // A
  } else {
    // B
  }
Jump threading is an optimization pass that replace the break statement with a goto A. After that replacement, found is always false, so the boolean variable and the if statement is deleted. The resulting code would look something like this [1]:

  for (...) {
    if (...) {
      // A
      goto end;
    }
  }
  // B
  end:;
What the lifting is doing here is essentially running this pass in reverse. If you find a branch pattern that doesn't meet any preordained schema (such as a loop with multiple exits), just synthesize a variable that tells you which target you're going to jump to. Were the compiler to optimize the resulting code, jump threading would convert it back into the gotos present in the compiled binary.

[1] This kind of optimization pass runs at a stage when the code is basically treated entirely as a CFG and there's no such thing as if statements or jumps or gotos, just conditional and unconditional branches terminating basic blocks. Any reflection of the code outside of this form is therefore somewhat imprecise.

mdp2021|3 years ago

Can you think of an example of "goto"-based code that cannot be translated into conventionally structure code?

zozbot234|3 years ago

> Rellic [1] implements an algorithm that generates goto-free control flows

Doesn't WebAssembly implement that already, via Relooper?