tsahyt | 12 years ago

Many NP-hard or NP-complete problems have (infinitely many) instances that are "easy" to solve. It is quite straightforward to write a procedure that produces infinitely many instances of 3-SAT that can be solved in polynomial time by the DPLL algorithm. But it is important to note that a problem is actually a set of instances.
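To make that concrete, here is a minimal DPLL sketch (unit propagation plus splitting; no pure-literal rule or clause learning) together with one such easy family: an implication chain that unit propagation alone resolves, no matter how large you make it. The clause encoding (lists of nonzero ints, -v negating v) is just for illustration.

```python
def dpll(clauses):
    """Minimal DPLL. clauses: list of lists of nonzero ints (-v negates v).
    Returns a set of true literals if satisfiable, else None."""
    clauses = [list(c) for c in clauses]
    model = set()
    # Unit propagation: repeatedly satisfy forced single-literal clauses.
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        model.add(lit)
        new = []
        for c in clauses:
            if lit in c:
                continue                     # clause satisfied, drop it
            if -lit in c:
                c = [l for l in c if l != -lit]
                if not c:
                    return None              # empty clause: conflict
            new.append(c)
        clauses = new
    if not clauses:
        return model                         # all clauses satisfied
    # Split on the first literal of the first remaining clause.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        sub = dpll(clauses + [[choice]])
        if sub is not None:
            return model | sub
    return None

# An "easy" family: the unit clause x1 plus an implication chain
# x1 -> x2 -> ... -> xn. Unit propagation alone solves it in linear
# time, for any n — no branching ever happens.
n = 50
chain = [[1]] + [[-i, i + 1] for i in range(1, n)]
print(dpll(chain))  # all of x1..xn forced true
```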

Your observation basically boils down to "how rare is the worst case?". There are actually algorithms with exponential worst case complexity that are used all over the place because - on average - they perform really well. The simplex algorithm for linear programming is an example. Essentially every deterministic pivoting rule studied so far comes with a worst case family (the Klee-Minty cube is the classic one) on which the algorithm takes exponentially many steps. However, on average it is quite a fast algorithm that can handle millions of variables and corresponding constraints on modern hardware. (Linear programming itself, by the way, is not hard in the complexity-theoretic sense: it can be solved in polynomial time by the ellipsoid method or by interior-point methods. Simplex just tends to be faster in practice.) As is the case with DPLL, good heuristics can speed up an algorithm a lot without changing its worst case complexity. Other examples are all over the place in AI. It's also in AI where the concept of pruning is probably encountered most often: the practice of discarding whole subtrees of the search tree by some sort of heuristic or deductive reasoning.
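The pruning idea in a nutshell: here is a small alpha-beta sketch (the canonical pruning technique from game-tree search) over a toy game tree encoded as nested lists with numeric leaves. The tree is just the standard textbook example, not from any particular application.

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    """Alpha-beta pruning on a game tree given as nested lists; leaves
    are numbers. Whole subtrees that provably cannot affect the final
    minimax value are skipped without being explored."""
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # prune: the minimizer will never allow this branch
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:
                break  # prune: the maximizer already has something better
        return value

# Plain minimax visits all 9 leaves; alpha-beta skips several of them
# and still returns the same minimax value.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # 3
```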

Those things aside, I think what we should be looking at from a theoretical point of view is when it makes sense to consider the average case complexity over the worst case complexity. This brings us back to "how rare is the worst case". Quicksort, for instance, is said to be an O(n log n) algorithm but has O(n^2) worst case complexity. However, an implementation can notice that it is heading toward this case - a pathologically unbalanced sequence of pivots shows up as excessive recursion depth - and switch to a different algorithm once this has been detected. That is exactly what introsort does, and it is how std::sort is implemented in common C++ standard libraries: quicksort with a depth limit of roughly 2 log n, falling back to heapsort beyond it. This is definitely a case where it makes sense to "forget" worst case complexity.
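A toy version of that guard, assuming nothing beyond the description above (real std::sort implementations additionally use insertion sort for tiny ranges and smarter pivot selection):

```python
import heapq
from math import log2

def introsort(a):
    """Introsort-style sketch: quicksort, but once recursion depth
    exceeds ~2*log2(n) — the signature of a bad pivot sequence — the
    current range is finished with heapsort, capping the worst case
    at O(n log n). Sorts the list in place and returns it."""
    def heapsort(lo, hi):
        sub = a[lo:hi]
        heapq.heapify(sub)
        a[lo:hi] = [heapq.heappop(sub) for _ in range(len(sub))]

    def partition(lo, hi):
        # Lomuto partition with the last element as pivot — deliberately
        # naive, so sorted input actually triggers the depth limit.
        pivot = a[hi - 1]
        i = lo
        for j in range(lo, hi - 1):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi - 1] = a[hi - 1], a[i]
        return i

    def sort(lo, hi, depth):
        while hi - lo > 1:
            if depth == 0:
                heapsort(lo, hi)  # too deep: bail out to heapsort
                return
            depth -= 1
            p = partition(lo, hi)
            sort(lo, p, depth)    # recurse left, loop on the right part
            lo = p + 1

    if len(a) > 1:
        sort(0, len(a), 2 * int(log2(len(a))))
    return a

# Already-sorted input is the worst case for this pivot choice; the
# depth limit kicks in and heapsort finishes the job.
print(introsort(list(range(10))))  # [0, 1, 2, ..., 9]
```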

For SAT, researchers have actually worked up statistics on how hard random instances are in terms of the ratio of clauses to variables. It turns out that there is a narrow band where instances are hardest: for random 3-SAT it sits at roughly 4.27 clauses per variable, right where instances flip from almost certainly satisfiable to almost certainly unsatisfiable. Googling "satisfiability threshold conjecture" will turn up the details.
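You can see the flip yourself with a tiny experiment sketch like the one below. It brute-forces satisfiability, which is fine at this scale; a real experiment would use DPLL and count branching decisions to measure hardness rather than just satisfiability. The helper names are mine, not from any library.

```python
import random
from itertools import product

def random_3sat(n_vars, n_clauses, rng):
    """Uniform random 3-SAT: each clause picks 3 distinct variables
    and negates each independently with probability 1/2."""
    return [tuple(v if rng.random() < 0.5 else -v
                  for v in rng.sample(range(1, n_vars + 1), 3))
            for _ in range(n_clauses)]

def satisfiable(clauses, n_vars):
    """Brute force over all 2^n assignments — toy-scale only."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

# Below the threshold almost everything is satisfiable, above it almost
# nothing is; the hard instances cluster around the crossover.
rng = random.Random(0)
n = 12
for ratio in (2.0, 4.3, 8.0):
    m = int(ratio * n)
    sat = sum(satisfiable(random_3sat(n, m, rng), n) for _ in range(20))
    print(f"m/n = {ratio}: {sat}/20 satisfiable")
```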

As such, yes, not all instances of NP-hard problems are actually hard to solve, but there are infinitely many that are. When encountering such problems in practice, one can usually exploit the special properties of the instances (or class of instances) one actually encounters and devise an algorithm that performs rather well on average.
