top | item 38988948

Dynamic programming is not black magic

468 points| qsantos | 2 years ago |qsantos.fr | reply

202 comments

[+] akoboldfrying|2 years ago|reply
It's good that the article points out that DP algorithms are "just" clever ways to cache a recursion. Looking for a recursive solution is certainly the best way to start out looking for a DP solution, in my experience -- if you can find one, memoising it is trivial and may give a big speedup (and may even be faster than a "bottom-up" DP, since it only ever computes solutions that we definitely need).

With enough practice, it's usually possible to come up with a (correct but slow) recursive solution. When turning this into a DP, it doesn't matter if there are large numbers of subproblems in the call tree -- what's important is that there are a relatively small number of distinct subproblems. (Since there's no point caching a result that's only ever needed one time.) And that's where the difficulty tends to lie: Figuring out how to partition the original problem into few enough distinct subproblems.
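To make the "memoising a recursion" point concrete, here is a minimal Python sketch using Fibonacci as a stand-in problem (not from the article; just the standard illustration):

```python
from functools import lru_cache

def fib_naive(n):
    # Exponential time: the call tree recomputes the same subproblems
    # many times, even though there are only n distinct ones.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Identical recursion; the cache ensures each distinct subproblem
    # is solved exactly once, giving linear time.
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
```

The top-down version also only computes the subproblems actually reached from the original call, as noted above.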

[+] vishnugupta|2 years ago|reply
> what's important is that there are a relatively small number of distinct subproblems.

This is the crucial part IMO. Whether the larger algorithm is recursive or iterative etc. is secondary. But yes, DP tends to show up in recursive algorithms most often.

[+] nsonha|2 years ago|reply
> DP algorithms are "just" clever ways to cache a recursion

This is what made it click for me. When I was in university, perhaps because of the prevalence of procedural programming at the time (as opposed to FP), DP looked like magic to me with the bottom-up tabulation examples in the textbook.

Practically that is the way to do it, because tail-call elimination isn't always applicable, but I wish they'd taught us the more intuitive way to look at it first (top-down, cache a recursion).

[+] Shorel|2 years ago|reply
Agree. When I first learned the subject, my first thought was:

As fancy as that feature is, it should be called array memoization, or call stack memoization.

The term "dynamic programming" should be reserved for something better. IMO.

[+] da39a3ee|2 years ago|reply
This is a widespread misconception: thinking of dynamic programming as just a form of memoized recursion is not the way to learn DP because it makes it extremely difficult to understand how to do the style of DP problems that involve filling out a 2D array.

For example, look at the "best time to buy and sell stock" series on leetcode: https://leetcode.com/problems/best-time-to-buy-and-sell-stoc....

These are much more naturally done by filling out an array, right? I've never done them with a recursion; I can't say I've thought hard about it -- is there a natural recursive solution?

(I linked to iii above but for anyone who hasn't tried them they are a great intro to DP problems; start with the first one: https://leetcode.com/problems/best-time-to-buy-and-sell-stoc...)
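For what it's worth, the first problem in that series does admit a natural DP with two states per day; a sketch (my own formulation, not a reference solution):

```python
def max_profit(prices):
    # Two DP states per day for the single-transaction version:
    #   hold: best "cash" while holding a share (i.e. minus the cheapest buy so far)
    #   cash: best profit after at most one complete buy+sell
    hold, cash = float("-inf"), 0
    for p in prices:
        hold = max(hold, -p)        # buy today, or keep an earlier buy
        cash = max(cash, hold + p)  # sell today, or keep an earlier profit
    return cash
```

The later problems in the series add more states per day (transaction counts, cooldowns), which is where the 2D-array view becomes natural.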

[+] dataflow|2 years ago|reply
The best analogy I have here is that saying "DP is just caching/memoization" is like saying "investment is just buying stuff and selling them later." Regardless of how technically accurate it might be, it misses such massive complexities and difficulties in the topic that it just makes you look silly rather than insightful.
[+] toomanyrichies|2 years ago|reply
Origin of the name "dynamic programming", from its inventor, Richard Bellman:

"I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes. An interesting question is, ‘Where did the name, dynamic programming, come from?’

The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word, research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term, research, in his presence.

You can imagine how he felt, then, about the term, mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose?

In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various reasons. I decided therefore to use the word, ‘programming.’ I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying—I thought, let’s kill two birds with one stone. Let’s take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it’s impossible to use the word, dynamic, in a pejorative sense.

Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities”.

Source:

https://alliance.seas.upenn.edu/~cis520/dynamic/2021/wiki/in...

[+] bombela|2 years ago|reply
I really like how this article exposes the problem recursively first then progressively adds caching, and finally reduces the size of the cache to only what is necessary.

I have often made the mistake of trying to get to the dynamic programming solution directly, and either got stuck or had to go through heroic efforts to get it working. I think from now on, I will force myself to go through the steps in order.

[+] marcodiego|2 years ago|reply
I had a very good algorithms professor. He studied at UCLA. His classes about dynamic programming were superb. He started with a problem for which the naive solution was simple and had exponential complexity. Then he broke the problem into smaller problems and reduced the complexity to something polynomial. Then he applied memoisation and the complexity dropped to linear.

I really would like to remember the problems he used.

[+] Horffupolde|2 years ago|reply
1. *Fibonacci Sequence*: The classic example where the naive recursive solution has exponential complexity. By storing previously computed values (memoization), the complexity can be reduced to linear.

2. *Coin Change Problem*: Given different denominations of coins and a total amount, finding the number of ways to make the change. The naive approach is exponential, but dynamic programming reduces it to polynomial complexity.

3. *Knapsack Problem*: Particularly the 0/1 Knapsack problem, where items with given weights and values must be placed in a knapsack of a fixed capacity to maximize total value. The naive exponential solution can be optimized using dynamic programming.

4. *Matrix Chain Multiplication*: Determining the most efficient way to multiply a chain of matrices. The problem can be solved in exponential time using a naive approach but becomes much more efficient with dynamic programming.

5. *Longest Common Subsequence*: Finding the longest subsequence common to two sequences. A classic dynamic programming problem that can be solved in polynomial time.

6. *Longest Increasing Subsequence*: Finding the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.

7. *Shortest Path Problems*: Like the Floyd-Warshall algorithm for finding the shortest paths in a weighted graph with positive or negative edge weights.

8. *Edit Distance (Levenshtein Distance)*: Finding the minimum number of edits (insertions, deletions, substitutions) needed to change one word into another.
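As a concrete instance of item 2, a sketch of the counting variant of coin change with a one-dimensional table (my own illustration):

```python
def count_ways(coins, amount):
    # ways[a] = number of ways to make amount a.
    # Processing one denomination at a time counts each combination once,
    # turning an exponential search into O(len(coins) * amount) work.
    ways = [1] + [0] * amount
    for c in coins:
        for a in range(c, amount + 1):
            ways[a] += ways[a - c]
    return ways[amount]
```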

[+] jakeinspace|2 years ago|reply
In addition to the suggested problems others posted, perhaps it was a scheduling problem? Like, for example, scheduling N overlapping events in time, like course schedules or processes for a CPU. Generally this would be done to optimize something like throughput - I believe that when you start adding special requirements, like "These 2 courses must be taken together", then things get much more complicated and intractable compared to plain-old dynamic programming.
[+] colordrops|2 years ago|reply
Did he study under Kang at UCLA?
[+] tromp|2 years ago|reply
Dynamic programming is what allowed me to compute the number of legal Go positions [1,2], a 171 digit number.

While the naive method would take time 3^(n²) to examine all possible positions on an n×n board, dynamic programming effectively strips away one dimension, reducing time to O(n^5 * 5.4^n), while using space O(n * 5.4^n).

[1] https://tromp.github.io/go/legal.html

[2] https://tromp.github.io/go/gostate.pdf

[+] qsantos|2 years ago|reply
Interesting, I have never really looked into this kind of calculation. Thanks for the links!
[+] _v7gu|2 years ago|reply
The name "Dynamic Programming" might seem out of place because it doesn't come from programming as a discipline. In this case, it actually refers to something like optimization, similar to linear programming. Dynamic programming is basically a method you can use to solve decision problems with discrete time, i.e picking the optimal sequence {a_t} in order to maximize \sum_t u_t(a_t) (plus constraints). The "dynamic programming" is defining a value function V* where V*(t) = max_{a_t}{ u_t(a_t) + V*(t-1) } which greatly reduces the dimensionality of the optimization problem.
[+] roenxi|2 years ago|reply
In fact, the official line [0] on where the name comes from is quite funny. I shall quote my favourite part:

> [Dynamic] also has a very interesting property as an adjective, and that is it's impossible to use the word dynamic in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It's impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to.

[0] https://en.wikipedia.org/wiki/Dynamic_programming#History

[+] glimshe|2 years ago|reply
Most of the time I hear the term being used by other people (not you :) ), I feel it's for showing off and looking smarter than everybody else -- "Look, I used dynamic programming to solve that problem" -- when in fact they were just employing what I'd consider the natural and intuitive approach to solving the problem. There was nothing they actually "used", besides happening to identify that the problem could be broken into progressively smaller subproblems.
[+] closeparen|2 years ago|reply
It's interesting how computing things - like optimization problems - used to be much more dominant in terms of what people thought about and did with computers. It feels like most of the time we are just doing data storage and retrieval combined with networking... some computations are embedded within those things, but usually well encapsulated.
[+] layer8|2 years ago|reply
“Optimization” is similarly misleading (as someone who once took a CS class on “optimization” expecting something very different ;)).
[+] uoaei|2 years ago|reply
If you go further back, "programming" describes it precisely, and what we call "programming" today is "writing code" which can be subdivided into different kinds of "programming" such as functional, declarative, procedural, etc. But there's a lot more that fits under that umbrella.
[+] eru|2 years ago|reply
Compare also 'Linear Programming'. Or the usage of a 'TV programme' or a 'musical programme'.
[+] cubefox|2 years ago|reply
You forgot to define u and t.
[+] sethammons|2 years ago|reply
Would it be wrong to just think "memoization" when you hear "dynamic programming"? The missing part may be intelligently breaking up the problem to use memoization?
[+] jltsiren|2 years ago|reply
Memoization is a more general technique. It's often little more than caching the results you happened to compute, in case you need them again in the future.

Dynamic programming is systematic memoization. You solve subproblems of increasing size until you reach a solution to the full problem. "Inductive algorithm" would be a fairly appropriate term, as the typical dynamic programming algorithm is effectively a proof by induction. Unfortunately, that term already has other meanings.

[+] JohnKemeny|2 years ago|reply
That's exactly how I teach dynamic programming. First solve it recursively, then add memoization (I call that top-down).

Then notice that recursion and memoization have some overhead, so construct the table bottom-up, remove the recursive call, and voila: dynamic programming.

[+] GuB-42|2 years ago|reply
In my approach to dynamic programming, memoization is step 2 of 3.

1- find a recursive algorithm

2- memoization

3- make it iterative / bottom-up

3b- if possible, optimize for space

Step 3 is the most characteristic part of dynamic programming, but even if you stopped at step 2, it would still qualify, I think; it just would not be as efficient as it could be.

Another way to think of step 3 would be: memoization is about caching, is there a way to pre-fill the cache?
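Taking the steps above all the way on the standard Fibonacci example: step 3 replaces the memoized recursion with an explicit loop, and step 3b notices that only two table cells are ever live (a sketch):

```python
def fib(n):
    # Step 3: bottom-up loop replaces the recursion entirely.
    # Step 3b: the full table fib[0..n] only ever reads its last two
    # entries, so keep just those -- O(1) space instead of O(n).
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```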

[+] kevindamm|2 years ago|reply
There are dynamic programming solutions not based on memoization. See for example finding the longest common substring between two strings. Memoization will not help as much because you only need the table's left and up cells once.

In general, if the problem's sub-problems have a lot of overlap and the optimal subproblem must be part of the optimal overall solution, you've got an opportunity for dynamic programming. Saying memoization is the only kind of dynamic programming is like saying hash tables are the only abstract data type.
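A sketch of that table-filling solution for longest common substring, keeping only the previous row of the table rather than a general-purpose memo cache:

```python
def longest_common_substring(s, t):
    # prev[j] = length of the common suffix ending at s[i-2], t[j-1].
    # Each cell needs only the diagonal neighbour from the previous row,
    # so one row of the table is enough.
    best = 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best
```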

[+] ecshafer|2 years ago|reply
Yes, it would be wrong in my book. First, an obvious counterpoint: memoization can be used outside of dynamic programming. Conversely, you can do most dynamic programming algorithms by storing the results in a table and searching the table afterwards for the best answer, with no memoization at all. Memoization is basically a strategy to speed up algorithms.
[+] dps|2 years ago|reply
>This year’s Advent of Code has been brutal (compare the stats of 2023 with that of 2022, especially day 1 part 1 vs. day 1 part 2).

I enjoyed completing AoC this year. While it was very clear that day 1 (esp. part 2) was significantly harder than in previous years (I wrote about this among other things [0]), OP's claim was not self-evident when comparing _current_ 2022 stats to _current_ 2023 stats, as folks have had an additional full year to complete the 2022 puzzles.

I grabbed the 2022 stats from Jan 14th 2023 [1] and, indeed, the difference is quite stark. Graphing the part two completion stats[2] for both years, there was a relatively similar starting cohort size on day 1, but 2023 looks clearly harder than 2022 up until day 15. As OP observes, the ratio[3] of folks completing pt1 but not going on to complete pt 2 is way higher for a lot of days in 2023 and suggests the day 5, 10, 12 and especially day 22 part 2s were particularly difficult.

[0] https://blog.singleton.io/posts/2024-01-02-advent-of-code-20...

[1] https://web.archive.org/web/20230114172513/https://adventofc...

[2] https://blog.singleton.io/static/imgs-aoc23/completion.png

[3] https://blog.singleton.io/static/imgs-aoc23/ratios.png

[+] Kwpolska|2 years ago|reply
Early AoC was fun, you could get away without anything fancy until late in the game. Then it got harder, not fun, so I gave up and stopped touching it.
[+] benrow|2 years ago|reply
I didn't get very far into AoC this year as I ran out of time. Maybe I'll pick it up again later.

But my point is, I was surprised at how hard day 5, part 2 was. I didn't give up and solved it, but went away wondering why I'd missed something obvious and overcomplicated it. So it brings some relief to know it was "supposed" to be a bit challenging!

[+] binglebob|2 years ago|reply
This was just my personal experience (which certainly came from trying out a different language than I typically use in my day to day), but I'd argue that day 1 part 2 wasn't _hard_, but improperly specified from the prompt. The examples given are:

  two1nine
  eightwothree
  abcone2threexyz
  xtwone3four
  4nineeightseven2
  zoneight234
  7pqrstsixteen
There is one critical example missing from this set and you can't exactly just figure out how you're meant to substitute the values without an example like:

  oneight
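A sketch of the scan-every-position approach that handles such overlaps (my own reading of the intended rule, not an official solution): "oneight" should yield first digit 1 and last digit 8, which naive string substitution gets wrong.

```python
WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}

def digits(line):
    # Check every position for a digit or a spelled-out word, so
    # overlapping words like "oneight" contribute both 1 and 8.
    out = []
    for i, ch in enumerate(line):
        if ch.isdigit():
            out.append(int(ch))
        else:
            for w, v in WORDS.items():
                if line.startswith(w, i):
                    out.append(v)
    return out

def calibration(line):
    d = digits(line)
    return d[0] * 10 + d[-1]
```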
[+] qsantos|2 years ago|reply
Thanks for the details. To add to this discussion, I have a script to see the progression over the days.

Looking at the last two columns, you can see how brutal 2023 was compared to 2022, especially in the beginning. In the first few days of 2022, most people kept playing, with a retention higher than 80% most days, and virtually everyone solved both parts. In contrast, in 2023 only 76% of people solved part 2 of day 1 after solving part 1, and many people gave up on days 3 and 5.

Interestingly, the last few days are not that much lower. That can be explained by the fact that AoC 2023 is more recent than AoC 2022, like you said. My interpretation is that this group of people will get through all the challenges regardless of the difficulty (to an extent, of course), while many other people give up when they realize it will take too much of their time.

    Stats for year 2022 of Advent of Code
    -------------------------------------
    
    Day   Both puzzles   One puzzle       Total   Rel. puzzle 1/2   Rel. day before
      1        280,838       15,047     295,885              95 %             100 %
      2        232,752       12,403     245,155              95 %              83 %
      3        200,016       11,392     211,408              95 %              86 %
      4        184,435        3,734     188,169              98 %              92 %
      5        157,392        3,116     160,508              98 %              85 %
      6        155,921        1,602     157,523              99 %              99 %
      7        113,241        2,592     115,833              98 %              73 %
      8        107,224        7,659     114,883              93 %              95 %
      9         82,414       11,449      93,863              88 %              77 %
     10         85,075        5,511      90,586              94 %             103 %
     11         68,838        9,258      78,096              88 %              81 %
     12         59,253        1,061      60,314              98 %              86 %
     13         51,512        1,220      52,732              98 %              87 %
     14         49,051          991      50,042              98 %              95 %
     15         39,677        5,773      45,450              87 %              81 %
     16         23,298        5,650      28,948              80 %              59 %
     17         21,525        6,237      27,762              78 %              92 %
     18         25,420        4,927      30,347              84 %             118 %
     19         17,516          928      18,444              95 %              69 %
     20         22,141        1,003      23,144              96 %             126 %
     21         23,022        3,060      26,082              88 %             104 %
     22         15,393        5,083      20,476              75 %              67 %
     23         18,531          254      18,785              99 %             120 %
     24         16,419          252      16,671              98 %              89 %
     25         13,192        7,473      20,665              64 %              80 %
    ~/src/advent-of-code% ./stats.py 2023
    Stats for year 2023 of Advent of Code
    -------------------------------------
    
    Day   Both puzzles   One puzzle       Total   Rel. puzzle 1/2   Rel. day before
      1        230,737       73,941     304,678              76 %             100 %
      2        196,352        9,256     205,608              95 %              85 %
      3        130,406       19,913     150,319              87 %              66 %
      4        130,271       17,691     147,962              88 %             100 %
      5         80,255       31,029     111,284              72 %              62 %
      6        103,358        1,918     105,276              98 %             129 %
      7         81,905        7,308      89,213              92 %              79 %
      8         74,034       14,707      88,741              83 %              90 %
      9         76,438        1,229      77,667              98 %             103 %
     10         48,313       17,054      65,367              74 %              63 %
     11         57,339        2,386      59,725              96 %             119 %
     12         30,985       14,440      45,425              68 %              54 %
     13         38,217        5,223      43,440              88 %             123 %
     14         36,500        7,457      43,957              83 %              96 %
     15         40,881        4,156      45,037              91 %             112 %
     16         35,347        1,023      36,370              97 %              86 %
     17         24,014        1,097      25,111              96 %              68 %
     18         24,799        4,937      29,736              83 %             103 %
     19         22,525        7,197      29,722              76 %              91 %
     20         18,287        4,398      22,685              81 %              81 %
     21         14,311       10,149      24,460              59 %              78 %
     22         15,830          988      16,818              94 %             111 %
     23         14,562        2,964      17,526              83 %              92 %
     24         11,864        4,918      16,782              71 %              81 %
     25         10,522        3,048      13,570              78 %              89 %
[+] ctk_25|2 years ago|reply
Would recommend the DPV Algorithms book and Georgia Tech's lectures on Udacity for graduate algorithms. The way to master dynamic programming is practice, practice, practice solving problems…
[+] mbwgh|2 years ago|reply
I find it disappointing that the "coming up with a recurrence formula" part is always glossed over, which to me is very much the hard part.
[+] tnecniv|2 years ago|reply
I think that CS classes also teach it really poorly. I did not understand it at all until I took an optimal control class and it instantly made sense
[+] Futurebot|2 years ago|reply
I like the short formulation to explain it: "Dynamic programming is approximately recursion + memoization + guessing."
[+] stevefan1999|2 years ago|reply
Dynamic programming is not black magic; _proving that a problem admits a dynamic programming solution, and that the solution is correct_, is. You need mathematical induction to formally prove it, just like with greedy algorithms, which were very counterintuitive (take, for example, greedy coloring) when I learned them both in uni.
[+] thayne|2 years ago|reply
As a primarily self-taught programmer, when I was first looking for a job after college, if I had been asked in an interview to use dynamic programming, I would have had no idea what that was. Thankfully that never happened to me. But I was familiar with the technique, and used it in multiple interviews.
[+] rav|2 years ago|reply
> And many common algorithms are actually just the application of dynamic programming to specific problems, including omnipresent path-finding algorithms such as Dijkstra’s algorithm.

Dijkstra's algorithm is an application of dynamic programming? I disagree. In dynamic programming, you tabulate the subproblems to solve, with static interdependencies between them leading to straightforward orders in which to solve the subproblems. In Dijkstra's algorithm, you need to compute the shortest path from s to each vertex, but the order in which you have to visit the vertices is only discovered along the way using a priority queue, so the subproblem interdependencies are not known ahead of time until you have actually solved the problem.

[+] vladimirralev|2 years ago|reply
Dijkstra is definitely dynamic programming, no doubt about it: you are still computing a new state from neighbouring state nodes while ignoring a substantial number of non-neighbouring states.

You just have to accept the more abstract definition of "state" where the state encodes subsets of the graph. The states in Dijkstra are represented by subgraphs and distances that incrementally include more nodes leaving us with a path of states to follow in some order.

It's not much different from the traveling salesman or viterbi in that sense. You come up with some topological order of states in abstract state-space and then follow that topological order to compute the new state based on the adjacent previously computed states and never look back to update already finalised states.

With this more abstract point of view, it's clear Dijkstra is dynamic programming.

There is a whole field of graph dynamic programming problems. And the closely related Markov chain problems.
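Both readings are visible in a minimal sketch of Dijkstra (my own, for illustration): the table of finalized distances is a DP table, but the order in which its entries are settled is discovered by the priority queue as the algorithm runs.

```python
import heapq

def dijkstra(graph, source):
    # graph: {node: [(neighbor, weight), ...]} with non-negative weights.
    # 'settled' records the order in which nodes are finalized -- the
    # "subproblem order" that is only discovered along the way.
    dist = {source: 0}
    settled = []
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; u was settled via a shorter path
        settled.append(u)
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist, settled
```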

[+] coolThingsFirst|2 years ago|reply
It's black magic bcs ppl don't want to admit they don't understand recursion lol. Fibonacci has always been a bad example for teaching recursion. That and Towers of Hanoi are the greatest failures in teaching CS.
[+] hxypqr|2 years ago|reply
Most of dynamic programming is just a method of reducing computational complexity by changing the noun objects in first-order logic (or, in the advanced version, second-order logic) so that the answers to unfinished tasks can be walked through using completed tasks. Only in very few cases is it necessary to extract and match the completed parts from the unfinished objects in this process, which often involves optimizing a function f(A, B). However, most of the time this process is futile.
[+] ram_rar|2 years ago|reply
I wonder how much mechanical sympathy dynamic programming (DP) has as opposed to brute force or other simpler techniques. On paper it seems like a smart technique; in reality, I have rarely seen it used in critical production systems. In fact, anything recursion-esque is eschewed. Can someone comment on their experience implementing DP in critical production systems, beyond the scope of programming contests and academia?