Biggest tile first, nearest entry node first should be linear time.
Decent chance you can hack 'optimal' out in a reasonable timeframe if you run that heuristic first and then prune the search space to discard any solution that would be worse than said first guess.
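For concreteness, here's a minimal sketch of that greedy first pass, assuming a toy e-graph encoding (the dict layout, cost model, and names are all made up for illustration — this is the usual bottom-up extractor, not guaranteed optimal when e-classes share sub-terms):

```python
# Toy e-graph: each e-class id maps to a list of e-nodes;
# an e-node is (op, cost, child e-class ids).

def greedy_extract(egraph, root):
    """Bottom-up fixpoint: repeatedly pick the cheapest e-node per
    e-class whose children already have known costs. Linear-ish in
    practice; this is the heuristic you'd run before pruning a search."""
    best = {}  # e-class id -> (total cost, chosen e-node)
    changed = True
    while changed:
        changed = False
        for cid, enodes in egraph.items():
            for op, cost, children in enodes:
                if all(c in best for c in children):
                    total = cost + sum(best[c][0] for c in children)
                    if cid not in best or total < best[cid][0]:
                        best[cid] = (total, (op, cost, children))
                        changed = True
    return best[root]

# Example: the root e-class can be `a * 2` or the equivalent `a << 1`.
egraph = {
    0: [("*", 2, [1, 2]), ("<<", 1, [1, 3])],
    1: [("a", 1, [])],
    2: [("2", 1, [])],
    3: [("1", 1, [])],
}
print(greedy_extract(egraph, 0))  # → (3, ('<<', 1, [1, 3]))
```

The cost of this pick then becomes the bound for branch-and-bound pruning: any partial solution exceeding it can be discarded immediately.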
I'm certainly not convinced by the argument that optimal egraph is equivalent to the minimum set cover problem and thus in NP. I can believe that egraph extraction can be modelled as minimum set cover, but there's a lot of additional structure in the egraph which I suspect takes us out of the general case.
By analogy, it is known that register allocation is NP complete because it's graph colouring. Despite that, register allocation on a chordal graph can be done in linear time, and SSA form gives you a chordal graph. Being a specific instance of a hard problem matters.
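The chordal-graph claim can be sketched directly: greedy coloring along the reverse of a perfect elimination ordering is optimal for chordal graphs and runs in O(V + E). The graph and ordering below are a toy example, not taken from any real allocator:

```python
def color_chordal(neighbors, peo):
    """Greedy coloring in reverse perfect-elimination order (PEO).
    For a chordal graph this uses the minimum number of colors —
    the linear-time special case the comment refers to."""
    color = {}
    for v in reversed(peo):
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Toy interference graph: the path a-b-c (chordal, 2-colorable).
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(color_chordal(neighbors, ["a", "b", "c"]))  # → {'c': 0, 'b': 1, 'a': 0}
```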
>I'm certainly not convinced by the argument that optimal egraph is equivalent to the minimum set cover problem and thus in NP. I can believe that egraph extraction can be modelled as minimum set cover
Are you actually misunderstanding, or are you trying to say something else? E-graph extraction is in NP because it's equivalent to ILP, and it's NP-complete because you can reduce set cover to it, not the other way around (in a completeness proof the reduction always goes to the problem, not from it).
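To make the ILP framing concrete: extraction can be written as a 0/1 program with one binary variable per e-node, minimizing total cost subject to the root class being covered and every selected e-node's child classes being covered. Here is a brute-force sketch of the same objective (toy encoding, hypothetical names; exponential, illustration only):

```python
from itertools import product

def optimal_extract(egraph, root):
    """Exhaustive search mirroring the 0/1 ILP: one choice per e-class,
    minimizing the cost of the DAG reachable from the root, with shared
    e-nodes counted once. E-nodes are (op, cost, child e-class ids)."""
    classes = list(egraph)
    best = None
    for choice in product(*(range(len(egraph[c])) for c in classes)):
        pick = dict(zip(classes, choice))
        seen, cost, stack = set(), 0, [root]
        while stack:
            cid = stack.pop()
            if cid in seen:
                continue
            seen.add(cid)
            _op, c, children = egraph[cid][pick[cid]]
            cost += c
            stack.extend(children)
        if best is None or cost < best:
            best = cost
    return best

egraph = {
    0: [("+", 1, [1, 2])],
    1: [("*", 3, [2, 2]), ("sq", 2, [2])],
    2: [("a", 1, [])],
}
print(optimal_extract(egraph, 0))  # → 4
```

The "count shared nodes once" DAG cost is exactly the wrinkle that makes the problem hard; a per-class greedy pass ignores sharing.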
>Despite that, register allocation on a chordal graph can be done in linear time, and SSA form gives you a chordal graph.
This isn't saying much since SSA destruction (phi elimination) isn't linear.
I had no idea what an e-graph was until 5 minutes ago, but this has the feeling to me of something that is nominally NP complete.
Languages tend to be designed with comprehensibility in mind - so the thoroughly unjustifiable hunch that I'm having is that the vast majority of the extreme complexity possibilities are for languages that no sane human would design, let alone actually work with.
I agree that this result is wholly unsurprising (I had just assumed it to be true, but it is nice to have confirmation).
I am not sure I understand your comment on language structure. E-graphs are used to reason automatically about equivalence of (PL or) math expressions; there are a lot of possible combinations of assembly opcodes or other bytecodes, and humans do not interact with them directly (usually), but compilers and formal tools have to.
This blog post could really use links to other sources.
First of all, I had no idea what an E-graph was, so I had to search for that myself.
Then the post presents the problem itself, without explaining a practical situation in which you’d need to compute/solve this. Would this be used by a compiler to come up with an optimal execution strategy? Or an interpreter to evaluate terms?
It talks about the use of a cost function, but to me it’s not clear where it comes from, and what the cost represents. Would that be CPU cycles? Memory use?
I decide to do a Google search for “E-graph extraction problem”, but apart from the blog post itself, I get no meaningful results. I feel lost.
Great to see this!
Funnily enough, I was just now working on a problem to which I'm trying to apply e-graphs (using my -- shameless plug -- Haskell e-graphs/eqsat library[1]).
Proving a problem is NP-complete should not be news. What should be news is when a problem turns out to have a polynomial-time algorithm (for example, PRIMES is in P).
my cynical eye sees this as an over-eager grad student rushing out his/her discovery onto hacker news. Next thing you know, an FPTAS for it may rear its ugly head.
I disagree. E-graphs as a data structure are used to represent an exponential combination of terms in a manageable (ie polynomial) structure. Thus, it is interesting to know what the limits of that "compression" are - what we can do in the polynomial representation and when we have to fall back to an exponential (or make compromises).
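As a concrete illustration of that compression: a chain of n e-classes with two alternatives each stores O(n) e-nodes yet represents 2^n distinct terms. A sketch (toy encoding, names made up):

```python
from math import prod

def count_terms(egraph, root):
    """Number of distinct terms an e-class represents (assumes an
    acyclic e-graph; cyclic ones represent infinitely many).
    E-nodes are (op, child e-class ids)."""
    memo = {}
    def count(cid):
        if cid not in memo:
            memo[cid] = sum(prod(count(c) for c in children)
                            for _op, children in egraph[cid])
        return memo[cid]
    return count(root)

def chain(n):
    # e-class i holds two equivalent ways (f or g) to build on class i+1
    eg = {n: [("x", [])]}
    for i in range(n):
        eg[i] = [("f", [i + 1]), ("g", [i + 1])]
    return eg

print(count_terms(chain(50), 0))  # 2**50 terms from just 101 e-nodes
```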
This is a breathtakingly rude comment. The blog post author is not some random grad student looking at another's work; he is actively working on egglog & is one of the key people driving work in the e-graphs space.
[1] https://github.com/alt-romes/hegg