top | item 46941836

(no title)

stevefan1999 | 21 days ago

I think one of the issue is that the register allocation algorithm -- alongside the SSA generation -- is not enough.

Generally after the SSA pass, you convert all of them into register transfer language (RTL) and then do register allocation pass. And for GCC's case it is even more extreme -- You have GIMPLE in the middle that does more aggressive optimization, similar to rustc's MIR. CCC doesn't have all that, and for register allocation you can try to do simple linear scan just as the usual JIT compiler would do though (and from my understanding, something CCC should do at a simple cost), but most of the "hard part" of compiler today is actually optimization -- frontend is mostly a solved problem if you accept some hacks, unlike me who is still looking for an elegant academic solution to the typedef problem.

discuss

order

adgjlsfhk1|21 days ago

Note that the LLVM approach to IR is probably a bit more sane than the GCC one. GCC has ~3 completely different IRs at different stages in the pipeline, while LLVM mostly has only canonical IR form for passing data around through the optimization passes (and individual passes will sometimes make their own temporary IR locally to make a specific analysis easier).

hackyhacky|21 days ago

What is the typedef problem?

nxobject|21 days ago

If stevefan1999's referring to a nasty frontend issue, it might be due to the fact that a name introduced by a typedef and an identical identifier can mingle in the same scope, which makes parsing pretty nasty – e.g. (example from source at end):

  typedef int AA;
  
  void foo()
  {
    AA AA;            /\* OK - define variable AA of type AA */
    int BB = AA * 2;  /\* OK - AA is just a variable name here \*/
  }

  void bar()
  {
    int aa = sizeof(AA), AA, bb = sizeof(AA);
  }

https://eli.thegreenplace.net/2011/05/02/the-context-sensiti...

I don't know off the top of my head whether there's a parser framework that makes this parse "straightforward" to express.

wahern|21 days ago

Lexical parsing C is simple, except that typedef's technically make it non-context-free. See https://en.wikipedia.org/wiki/Lexer_hack When handwriting a parser, it's no big deal, but it's often a stumbling block for parser generators or other formal approaches. Though, I recall there's a PEG-based parser for C99/C11 floating around that was supposed to be compliant. But I'm having trouble finding a link, and maybe it was using something like LPeg, which has features beyond pure PEG that help with context dependent parsing.