(no title)
mrow84 | 3 years ago
A=60 is the amount in our chosen envelope - we are given this information in the variant in this thread. What we still don't know is if that is the larger amount (and thus the other envelope contains 30, and therefore x=30), or the smaller amount (and the other envelope contains 120, and therefore x=60).
The error (in step 7) is to calculate the arithmetic expectation of those absolute values, because they do not exist "together". The correct arithmetic mean can be obtained by considering the different conditions under which those values do exist, as described in [0]. However, the ratios do exist "together" - the other envelope contains either double or half of 60 - so we could instead calculate the geometric mean, of either the ratios or the corresponding absolute values, and obtain the correct result:
(2 * 0.5) ** 0.5 = 1
(120 * 30) ** 0.5 = 60
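As a rough check of the two lines above, here is a minimal Python sketch that computes both geometric means and, for contrast, the naive arithmetic mean of the two absolute values. The variable names are made up for illustration; only A=60 and the double-or-half ratios come from the comment.

```python
import math

A = 60  # amount in the chosen envelope, as given in this variant

ratios = [2, 0.5]                  # other envelope holds double or half of A
amounts = [A * r for r in ratios]  # the two candidate absolute values: 120 and 30

# Geometric mean of the ratios and of the absolute values.
geo_ratio = math.prod(ratios) ** (1 / len(ratios))
geo_amount = math.prod(amounts) ** (1 / len(amounts))

# Naive arithmetic mean of the absolute values (the step being criticised).
naive_mean = sum(amounts) / len(amounts)

print(geo_ratio, geo_amount, naive_mean)
```

The geometric means come out at a ratio of 1 and an amount of 60, while the naive arithmetic mean comes out at 75, which is the familiar 5/4 of A.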
[0] https://en.wikipedia.org/wiki/Two_envelopes_problem#Other_si...
edit: changed “first envelope” to “chosen envelope” for clarity.
JoshCole | 3 years ago
My correction is still valid though. You're not handling step ten properly. You didn't work over the information sets, didn't solve the actual graph that is the game, didn't handle the under-specified policy function.
To try and show you that your solution isn't the actual solution: well, both options have the same EV. So I choose switch every time, because why not. As you are no doubt aware, I never actually get that EV, because I'm constantly swapping. The sixty is a mirage. For my policy choice, the answer was undefined or zero, depending on how you write it down. But you told me they had the same EV. So if they did, why did my choice not produce that EV? Ergo, your solution only appears to be giving you the EV.
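One way to make the "switch every time" point concrete is a small simulation. This is only a sketch under assumptions that are not in the thread: the envelopes hold a fixed pair (x, 2x) with x=30, a policy is "switch n times, then keep", and the always-switch policy is modelled as `None` because it never reaches a keep node. The function names are hypothetical.

```python
import random

def play(num_switches, x=30, rng=random):
    """Play one game with envelopes (x, 2x): switch num_switches times, then keep.

    num_switches=None models "switch every time": the keep node is never
    reached, so no payoff is ever realised and None is returned.
    """
    if num_switches is None:
        return None
    envelopes = [x, 2 * x]
    held = rng.randrange(2)        # index of the envelope picked at random
    for _ in range(num_switches):
        held = 1 - held            # swap to the other envelope
    return envelopes[held]

def average_payoff(num_switches, trials=100_000, seed=0):
    """Average realised payoff of a policy, or None if it never pays out."""
    rng = random.Random(seed)
    payoffs = [play(num_switches, rng=rng) for _ in range(trials)]
    if any(p is None for p in payoffs):
        return None
    return sum(payoffs) / trials

print(average_payoff(0), average_payoff(1), average_payoff(None))
```

Under these assumptions, any policy that eventually keeps averages close to 1.5x, while the always-switch policy has no realised payoff to average at all.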
Think about that for a while and you'll start to realize why I homed in on specifying a recurrence relation with the terminal keep node and why I'm so eager to escape the trap of their flawed problem model.
JoshCole | 3 years ago
Let's take a step back and learn the important lesson for more complex situations. Your policy influences your expected value. If you don't keep that in mind, you won't be able to calculate expected value correctly. You aren't trying to search for the best thing to do on the basis of expected value. You are searching for the right policy to provoke a high expected value. The difference is subtle, but essential.
How do we correct it? Well, the right formalisms that allow you to search for the correct policy come from several fields, but one of them is game theory. In game theory, state-based reasoning is considered incorrect when you are dealing with imperfect information. This is because you aren't in a state - you are in an information set. When you are playing the game you have to consider every game you could be in, because you don't know which you are in.
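A minimal sketch of reasoning over an information set rather than a single state, for the unopened-envelope version of the game. The assumptions here are mine, not the thread's: the smaller amount x is fixed, the two states (holding the smaller or the larger envelope) are equally likely, and the helper names are hypothetical.

```python
from fractions import Fraction

def information_set(x):
    """Both states consistent with holding one unopened envelope of the pair (x, 2x)."""
    return [
        {"held": x, "other": 2 * x, "prob": Fraction(1, 2)},  # we hold the smaller
        {"held": 2 * x, "other": x, "prob": Fraction(1, 2)},  # we hold the larger
    ]

def expected_value(x, action):
    """Expected payoff of 'keep' or 'switch', summed over every state in the set."""
    key = "held" if action == "keep" else "other"
    return sum(s["prob"] * s[key] for s in information_set(x))

print(expected_value(30, "keep"), expected_value(30, "switch"))
```

Summing over both states gives the same value, 1.5x, for keeping and for switching once, because each action is evaluated in every game you could be in rather than in one assumed state.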
This is a second problem with the analysis, but I think you corrected this one.
They ask to be able to translate this into more complex situations. So the general lesson here is about considering counterfactuals in your analysis. An example of this in practice is Jeff Bezos talking about his decision to found Amazon on the basis of regret minimization, based on his theory about how he would feel in various counterfactual futures. He didn't consider just one EV, the founding of Amazon, but also other EVs, like founding Amazon and failing, and not founding Amazon and doing other things.
I think I get why you think I'm conflating A, but I'm actually trying to point out that the Wikipedia article is conflating A, and so it's hard to have a productive discussion because we have inherited their misuse of terms. I don't want to conflate A, but the Wikipedia article defined A in their expected value calculation, and in that equation it ends up taking on a different meaning from the one it has when it is defined to be 60. And their meaning ends up claiming things like 1=2 in practice, because of the properties of the hidden counterfactual part of their equations - just because they neglect to show them doesn't mean they don't exist in the correct mathematics. So the logical contradiction is there - which is exactly the thing the problem asks us to identify.