tsahyt | 12 years ago
NP is the class of decision problems that can be solved by a non-deterministic Turing machine in polynomial time. A problem is called NP-hard when every problem in NP reduces to it in polynomial time; equivalently, when some NP-complete problem such as 3-SAT does. A problem is called NP-complete if it is both NP-hard and in NP. Since TSP (in its usual optimization form) is not a decision problem, it is not in NP and therefore cannot be NP-complete. It is, however, NP-hard.
One can call this a minor nitpick. I still point it out every now and then when I hear people talking about such things. However, the really important thing is that we haven't found a way to solve NP-hard (and hence of course NP-complete) problems efficiently on a deterministic Turing machine (and hence computers as we can actually build them). An informal way of saying the same thing is "TSP is really really hard to solve".
Either way, I'm always surprised at how many of the problems we encounter in our everyday lives are actually NP-hard. Most decent puzzle games, for instance, turn out to be NP-hard. Goes to show that we don't really enjoy solving easy problems.
EDIT: Of course there's also a decision version of TSP, as others have pointed out. This one is NP-complete of course.
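To make the decision version concrete, here's a minimal sketch (in Python; the `verify_tsp_certificate` helper and the example distances are made up for illustration) of why decision TSP is in NP: a proposed tour serves as a certificate that can be checked against a length bound in polynomial time.

```python
def verify_tsp_certificate(dist, tour, bound):
    """Polynomial-time verifier for decision TSP: given a distance
    matrix, a proposed tour (the certificate), and a bound, check
    that the tour visits every city exactly once and that its total
    length is at most the bound."""
    n = len(dist)
    if sorted(tour) != list(range(n)):  # must be a permutation of all cities
        return False
    total = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= bound

# 4 cities arranged on a unit square (hypothetical example data)
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]

print(verify_tsp_certificate(dist, [0, 1, 2, 3], 4))  # tour of length 4: True
print(verify_tsp_certificate(dist, [0, 2, 1, 3], 4))  # length 6 > 4: False
```

Finding a good tour may take exponential time, but checking one never does, which is exactly what membership in NP demands.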
gizmo686 | 12 years ago
An interesting question is how many of these puzzles are NP-hard in practice. For example, Minesweeper is NP-hard. However, many games of Minesweeper can be solved by simple deductive reasoning in polynomial time. This leads to the question of what the average difficulty of a random game is. My best attempt at formalizing this: imagine reducing every game with a polynomial-time algorithm, then look at the complexity of the remaining games. The complexity of an original game of size n could be the average over all games of size n after reduction by your polynomial-time algorithm. If this new complexity is subexponential, then it feels as if there is some sense in which the game is not often NP-hard.
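As a rough illustration of the kind of polynomial-time deduction meant here, the sketch below (Python; the `deduce` helper and the board encoding are invented for this example) applies the two basic Minesweeper rules to a fixpoint: if a revealed number is already accounted for by known mines, its remaining hidden neighbours are safe; if it can only be satisfied by marking all remaining hidden neighbours, they are mines.

```python
def deduce(numbers, unknown, mines):
    """Apply the two basic Minesweeper deduction rules to a fixpoint.
    numbers: {cell: count} for revealed cells; unknown: set of
    unrevealed cells; mines: set of cells already known to be mines.
    Returns (mines, safe) deduced without any search."""
    safe = set()
    changed = True
    while changed:
        changed = False
        for (r, c), k in numbers.items():
            nbrs = {(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0)}
            u = nbrs & unknown
            m = len(nbrs & mines)
            if not u:
                continue
            if k - m == len(u):   # all remaining neighbours must be mines
                mines |= u; unknown -= u; changed = True
            elif k == m:          # mine count already satisfied: rest are safe
                safe |= u; unknown -= u; changed = True
    return mines, safe

# hypothetical 2-row board: top row hidden, bottom row shows "0 1"
found_mines, found_safe = deduce({(1, 0): 0, (1, 1): 1},
                                 {(0, 0), (0, 1), (0, 2)}, set())
print(found_mines, found_safe)  # mine at (0, 2); (0, 0) and (0, 1) safe
```

Each pass is polynomial, and many boards resolve completely this way; the NP-hardness lives in the (comparatively rare) configurations where no such local rule fires and genuine case analysis is needed.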
A more concrete example that comes to mind is Haskell's type inference. In general, this type inference is NP-hard. However, there are algorithms that solve it in polynomial time assuming a bound on nesting depth. Because no such bound actually exists, type inference is still exponential in the worst case. But because we never see arbitrarily deep nesting (except in contrived examples), this does not matter in practice.
tsahyt | 12 years ago
Your observation basically boils down to "how rare is the worst case?". There are algorithms with exponential worst-case complexity that are used all over the place because, on average, they perform really well. The simplex algorithm for linear programming is one example. Every variant of this algorithm comes with a family of worst-case problems on which it takes exponential time. On average, however, it is quite fast and can handle millions of variables and corresponding constraints on modern hardware. (Linear programming itself actually turns out to be solvable in polynomial time, e.g. by interior-point methods, but simplex often wins in practice.) As with DPLL, good heuristics can speed up an algorithm a lot without changing its worst-case complexity. Other examples are all over the place in AI. It's also in AI where the concept of pruning is probably encountered most often: the practice of discarding whole subtrees of the search tree based on some heuristic or deductive reasoning.
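As a toy illustration of pruning, here is a minimal DPLL sketch in Python (simplification and unit propagation only, no branching heuristics - a real solver does far more): a falsified clause cuts off an entire subtree at once, and unit clauses force assignments without branching at all.

```python
def dpll(clauses, assignment=None):
    """Minimal DPLL SAT solver: exponential worst case, but unit
    propagation and early conflict detection prune large parts of
    the search tree in practice. Clauses are lists of nonzero ints;
    -v denotes the negation of variable v."""
    if assignment is None:
        assignment = {}
    # simplify clauses under the current partial assignment
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None  # clause falsified: prune this whole subtree
        simplified.append(rest)
    if not simplified:
        return assignment  # every clause satisfied: found a model
    # unit propagation: a one-literal clause forces an assignment
    for clause in simplified:
        if len(clause) == 1:
            l = clause[0]
            return dpll(simplified, {**assignment, abs(l): l > 0})
    # otherwise branch on the first unassigned variable
    v = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, v: value})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))  # a satisfying assignment
print(dpll([[1], [-1]]))                  # unsatisfiable: None
```

The worst case is still exponential, but on many practical instances propagation does almost all the work before any branching happens.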
Those things aside, I think what we should be looking at from a theoretical point of view is when it makes sense to consider average-case complexity over worst-case complexity. This brings us back to "how rare is the worst case?". Quicksort, for instance, is usually called an O(n log n) algorithm, but it has O(n^2) worst-case complexity. However, a good sorting implementation can detect at runtime that it is heading toward the bad case - by tracking recursion depth - and switch to a different algorithm with an O(n log n) worst case, such as heapsort. This is what introsort does, and it is how std::sort is implemented in common C++ standard libraries. This is definitely a case where it makes sense to "forget" the worst-case complexity.
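The depth-switch idea can be sketched as follows (Python, and deliberately non-in-place for brevity; real introsort implementations sort in place and also use insertion sort for small ranges):

```python
import heapq
import math

def introsort(a):
    """Quicksort with a recursion-depth limit: once the depth exceeds
    about 2*log2(n), fall back to heapsort so that the overall worst
    case stays O(n log n) instead of quicksort's O(n^2)."""
    def heapsort(xs):
        heapq.heapify(xs)
        return [heapq.heappop(xs) for _ in range(len(xs))]

    def sort(xs, depth):
        if len(xs) <= 1:
            return xs
        if depth == 0:
            return heapsort(xs)  # pathological input detected: switch algorithms
        pivot = xs[len(xs) // 2]
        less = [x for x in xs if x < pivot]
        equal = [x for x in xs if x == pivot]
        greater = [x for x in xs if x > pivot]
        return sort(less, depth - 1) + equal + sort(greater, depth - 1)

    return sort(list(a), 2 * max(1, int(math.log2(max(len(a), 2)))))

print(introsort([5, 3, 8, 1, 9, 2, 7]))  # → [1, 2, 3, 5, 7, 8, 9]
```

Typical inputs never hit the depth limit and get quicksort's good average behaviour; adversarial inputs trip the limit and get heapsort's worst-case guarantee instead.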
For SAT, researchers have actually worked up statistics on how hard certain instances are in terms of the number of clauses and the number of symbols. It turns out that there is a range of the ratio between the two where problems are the hardest - for random 3-SAT, roughly where the ratio of clauses to symbols is around 4.3, the conjectured satisfiability threshold. I don't have a source right now, but googling "satisfiability threshold conjecture" should turn up results.
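One can observe the phase transition on toy instances with a brute-force experiment like the following sketch (Python; the instance size, trial count, and seed are arbitrary choices for this example): the fraction of satisfiable instances drops sharply as the clause-to-variable ratio grows past the threshold.

```python
import itertools
import random

def random_3sat(n_vars, n_clauses, rng):
    """Generate a random 3-SAT instance over variables 1..n_vars:
    each clause picks 3 distinct variables and negates each with
    probability 1/2."""
    return [[v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, n_vars + 1), 3)]
            for _ in range(n_clauses)]

def satisfiable(clauses, n_vars):
    """Brute-force satisfiability check (fine for the tiny n used here)."""
    for bits in itertools.product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

rng = random.Random(0)
n = 10
for ratio in (2, 4, 6, 8):
    sat = sum(satisfiable(random_3sat(n, ratio * n, rng), n) for _ in range(30))
    print(f"clauses/vars = {ratio}: {sat}/30 satisfiable")
```

Low ratios are almost always satisfiable (few constraints), high ratios almost never (overconstrained); the hard instances for solvers cluster in between, near the threshold.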
As such, yes, not all instances of NP-hard problems are actually hard to solve, but there are infinitely many that are. When encountering such problems in practice, one can usually exploit the special properties of the instance (or class of instances) one actually encounters and devise an algorithm that performs rather well on average.