### Maximizing your earnings with money envelopes: a mathematical riddle

Here’s an interesting paradoxical puzzle told to me by my friend Bryan, known as the two envelopes problem, and my proposed resolution of it.

Suppose you are presented with two envelopes, envelope A and envelope B. One envelope contains twice as much money as the other, but otherwise the amounts could be *anything*. You choose an envelope, and you are allowed to keep all of its contents. Once you have chosen, you are offered the opportunity to switch. The question is which option is better. While the answer may seem intuitive, consider this argument, which gives a paradoxical result:

Call the amount of money in the envelope you selected \(a\). If you stay with your initial choice, you are guaranteed this amount.

On the other hand, if you switch, you have a 50% chance of ending up with an envelope containing \(2a\), and a 50% chance of getting the envelope containing \(\frac{1}{2} a\). So your expected value by switching is \(\frac{1}{2} (\frac{1}{2} a)+\frac{1}{2} (2 a)=\frac{5}{4} a\), which is better than the amount \(a\) you are guaranteed if you do not switch.

The paradox is that this is absurd, since your initial choice was totally random to begin with, so switching cannot make any difference. What’s going on here?
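
Before digging into the argument, it can help to check numerically that switching really makes no difference. The following is a minimal sketch rather than part of the riddle itself: it assumes, purely for illustration, that the smaller amount is drawn from an exponential distribution with mean 1. The function name `simulate` and that choice of distribution are assumptions made for this example.

```python
import random

def simulate(trials=200_000, seed=0):
    """Average payoff for staying vs. switching over many random envelope pairs."""
    rng = random.Random(seed)
    stay_total = 0.0
    switch_total = 0.0
    for _ in range(trials):
        s = rng.expovariate(1.0)   # smaller amount; exponential with mean 1 (assumed)
        pair = (s, 2 * s)          # the two envelopes
        pick = rng.randrange(2)    # choose an envelope uniformly at random
        stay_total += pair[pick]
        switch_total += pair[1 - pick]
    return stay_total / trials, switch_total / trials

stay_avg, switch_avg = simulate()
print(f"stay: {stay_avg:.3f}  switch: {switch_avg:.3f}")
```

Both averages should land near 1.5, that is, \(\frac{3}{2}\) times the mean of the smaller amount, with no sign of a \(\frac{5}{4}\) advantage for switching.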

#### Possible objection

Some people object to this riddle, saying that it is unfair to use \(a\) for two distinct possibilities — the larger value and the smaller value. They argue that this is the source of the paradox, and believe they have solved it.

However, this argument doesn’t work. If we wanted to, we could open up the envelope right after you make your initial selection, and *see* what the enclosed value is. This is a perfectly well defined quantity, and we call it \(a\).

They may make a counter-argument as follows:

Instead of defining \(a\), let’s define \(s\), a variable representing the smaller of the two values in the envelopes. Then half the time you choose \(s\), and the other half you choose \(2s\), so your expected earnings are \(\frac{3}{2} s\). This is the answer regardless of your initial selection, so the paradox is solved.

While the argument sketched above is correct, it doesn’t prove that there’s something wrong with considering the quantity equal to the value contained in the chosen envelope. Our goal isn’t to solve the problem with *some* argument, it’s to discover what is flawed with the argument we started with. *That* is the paradox.

It’s easy to convince yourself that the problem is flawed in some fundamental way, but I think it’s worthwhile to work through it carefully, because doing so gives us better intuition for solving similar problems in the future.

#### Analysis

The following is my analysis of this problem, along with a resolution.

To be totally general, suppose we generate the envelope distribution by first selecting the lower-valued (or small) envelope from a known distribution \(\rho(x)\). We will make the simplifying assumption that money is continuous in the sense that it is infinitely divisible (we even allow for fractional pennies). For every one of these “small” envelopes in our sample space, there is exactly one envelope which contains twice the value. These envelope pairs are always drawn together. Then if we happen to know we have chosen the smaller envelope, we have

\(p\left(a=x\;\big|\;s\right)=\rho(x)\),

where the notation on the left side means “probability density that the chosen envelope value is \(x\), given that the smaller envelope has been chosen (s for ‘small’)”.

Similarly, if we know we have chosen the larger envelope, then we know we have sampled a distribution scaled up by a factor of 2. After scaling and properly normalizing the distribution, we have

\(p\left(a=x\;\big|\;b\right)=\frac{1}{2}\rho(\frac{x}{2})\),

where again, the \(b\) means you have chosen the larger of the two envelopes (b for ‘big’).

Then it follows that if we know nothing about which envelope we picked, the probability distribution is

\(p\left(a=x\right)=\frac{\rho(x)+\frac{1}{2}\rho(\frac{x}{2})}{2}\).

The following graph shows an example probability distribution for the amount found given the small envelope is chosen (orange) and the large envelope is chosen (red).

Then we can compute our expected earnings by sticking with our first choice:

\(\displaystyle\int_0^\infty x \frac{\rho(x)+\frac{1}{2}\rho(\frac{x}{2})}{2} \; dx = \displaystyle\int_0^\infty x\;\rho(x)\; dx (\frac{1}{2}+\frac{1}{2}\cdot 2)=\boxed{\frac{3}{2} \left<x\right>_\rho}\).
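
As a sanity check on this integral, here is a small numerical sketch. It assumes the example density \(\rho(x)=e^{-x}\) (my choice for illustration, not something fixed by the problem) and uses a simple midpoint rule; the helper names `rho`, `p_a`, and `integrate` are hypothetical.

```python
import math

def rho(x):
    """Assumed example density for the smaller amount: exponential with mean 1."""
    return math.exp(-x) if x >= 0 else 0.0

def p_a(x):
    """Density of the chosen envelope's value: (rho(x) + rho(x/2)/2) / 2."""
    return 0.5 * (rho(x) + 0.5 * rho(x / 2))

def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

UPPER = 80.0  # for this rho the integrands are negligible beyond this point

total_prob = integrate(p_a, 0.0, UPPER)                  # should be close to 1
mean_rho = integrate(lambda x: x * rho(x), 0.0, UPPER)   # <x>_rho
stay_value = integrate(lambda x: x * p_a(x), 0.0, UPPER)

print(f"normalization: {total_prob:.6f}")
print(f"<x>_rho:       {mean_rho:.6f}")
print(f"stay value:    {stay_value:.6f}")
```

For this choice of \(\rho\), the mixed density integrates to 1 and the expected value of staying comes out at \(\frac{3}{2}\left<x\right>_\rho\), matching the boxed result.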

This is the result we know must be correct whether we switch or not. But now it’s time to see whether the reasoning from the original paradox still holds. What happens if we switch instead of staying with our original choice?

Now for any value of \(a\), there are two possibilities. We could have the smaller envelope, or we could have the larger envelope. But what are the chances that we chose the smaller envelope given \(a=x\)? Conversely, what are the chances we got the larger envelope given \(a=x\)? For this, we need Bayes’ Law:

\(P\left(A\big| B\right)=\frac{P(B| A) P(A)}{P(B)}\).

This works just as well for distributions as it does for absolute probabilities, or in our case, both at the same time. Using this format, we have

\(P\left(s\;\big|\;a=x\right)=\frac{\rho(x) \cdot \frac{1}{2}}{\frac{1}{2}(\rho(x)+\frac{1}{2}\rho(\frac{x}{2}))}=\frac{\rho(x)}{\rho(x)+\frac{1}{2} \rho(\frac{x}{2})}\),

and

\(P\left(b\;\big|\;a=x\right)=\frac{\frac{1}{2} \rho(\frac{x}{2}) \cdot \frac{1}{2}}{\frac{1}{2}(\rho(x)+\frac{1}{2}\rho(\frac{x}{2}))}=\frac{\frac{1}{2}\rho(\frac{x}{2})}{\rho(x)+\frac{1}{2} \rho(\frac{x}{2})}\).
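
To see how strongly these posterior probabilities can depend on \(x\), here is a short sketch that evaluates them for the assumed example density \(\rho(x)=e^{-x}\). The density and the function names `p_small_given` and `p_big_given` are illustrative assumptions, not part of the problem statement.

```python
import math

def rho(x):
    """Assumed example density for the smaller amount: exponential with mean 1."""
    return math.exp(-x) if x >= 0 else 0.0

def p_small_given(x):
    """P(s | a = x) = rho(x) / (rho(x) + rho(x/2) / 2)."""
    small = rho(x)
    big = 0.5 * rho(x / 2)
    return small / (small + big)

def p_big_given(x):
    """P(b | a = x); the two posteriors sum to 1."""
    return 1.0 - p_small_given(x)

for x in (0.1, 1.0, 2 * math.log(2), 3.0, 10.0):
    print(f"x = {x:6.3f}  P(small) = {p_small_given(x):.4f}  P(big) = {p_big_given(x):.4f}")
```

For this particular \(\rho\), the chance of holding the smaller envelope starts near \(\frac{2}{3}\) for tiny amounts, passes through exactly \(\frac{1}{2}\) at \(x=2\ln 2\), and falls toward zero for large amounts. It is not a constant 50-50.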

Then, to compute your expected earnings by switching, we compute

\(\displaystyle\int_0^\infty \underbrace{\frac{\rho(x)+\frac{1}{2}\rho(\frac{x}{2})}{2}}_{\text{Initially chose $x$}}\ \left[\frac{1}{2} x \underbrace{\frac{\frac{1}{2} \rho(\frac{x}{2})}{\rho(x)+\frac{1}{2}\rho(\frac{x}{2})}}_{\text{Initially chose big given $x$}}+2x \underbrace{\frac{\rho(x)}{\rho(x)+\frac{1}{2}\rho(\frac{x}{2})}}_{\text{Initially chose small given $x$}}\right]\; dx \).

Luckily, the messy numerator of the first factor cancels with the denominator of both terms in the square brackets. This leaves us with

\(\displaystyle\int_0^\infty \frac{1}{2} \left[\frac{1}{2} x\cdot \frac{1}{2}\rho(\frac{x}{2})+2x\cdot\rho(x)\right]\; dx \).

In the first term we can substitute \(x\rightarrow 2x\) (so that \(dx\rightarrow 2\,dx\)) to make the integrand depend only on \(\rho(x)\), which gives

\(\displaystyle\int_0^\infty x \rho(x) (\frac{1}{2}+1)\; dx =\boxed{\frac{3}{2} \left<x\right>_\rho}\).
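
The same conclusion can be checked numerically. The sketch below again assumes the example density \(\rho(x)=e^{-x}\) and a midpoint rule (all helper names are hypothetical). It evaluates the switching integral with the proper posterior weights, and, for contrast, with the naive 50-50 weights from the original argument.

```python
import math

def rho(x):
    """Assumed example density for the smaller amount: exponential with mean 1."""
    return math.exp(-x) if x >= 0 else 0.0

def p_a(x):
    """Density of the chosen envelope's value."""
    return 0.5 * (rho(x) + 0.5 * rho(x / 2))

def p_small_given(x):
    """Posterior probability of holding the smaller envelope, given a = x."""
    return rho(x) / (rho(x) + 0.5 * rho(x / 2))

def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def switch_integrand(x):
    # Weight the two outcomes of switching by the posterior probabilities.
    ps = p_small_given(x)
    return p_a(x) * ((1.0 - ps) * 0.5 * x + ps * 2.0 * x)

def naive_integrand(x):
    # The original argument: 50-50 regardless of x, i.e. (5/4) x.
    return p_a(x) * 1.25 * x

UPPER = 80.0  # for this rho the integrands are negligible beyond this point
switch_value = integrate(switch_integrand, 0.0, UPPER)
naive_value = integrate(naive_integrand, 0.0, UPPER)

print(f"switch (posterior weights): {switch_value:.4f}")
print(f"switch (naive 50-50):       {naive_value:.4f}")
```

With this \(\rho\), where \(\left<x\right>_\rho=1\), the posterior-weighted value is 1.5, the same as staying, while the naive 50-50 weighting gives 1.875, which is \(\frac{5}{4}\) of the true value.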

So in this analysis, we obtain the same answer by switching, as our intuition tells us.

#### The paradox revisited

You may not feel very satisfied about this, because even though we took the same route as the original paradox (but being more careful about conditional probabilities), we haven’t exactly put our finger on what is wrong with the initial argument.

In the original formulation, we assumed that the probability of having chosen the larger (or smaller) envelope does not depend on the value of \(a\). Is this valid? It seems valid if we imagine the amounts spread evenly over all positive values, with no preferred scale, so that seeing \(a\) tells us nothing about which envelope we hold. But this is precisely the problem. A distribution that treats every amount of money as equally likely is impossible, because it cannot be normalized. And even if we pretend that such a distribution makes sense, there is no bound on how large a bounty can be concealed in the envelopes! Your expected earnings would already be infinite, and infinity times \(\frac{5}{4}\) is still infinite! Even in this unrealistic example there would be no paradox; the only paradox is that we are thinking of a problem where we expect to receive an unphysical amount of money.
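
To make the 50-50 condition precise, here is a short worked check using the posterior from the Analysis section (a sketch, with \(c\) an arbitrary positive constant introduced only for illustration). Requiring equal odds at every value of \(x\) means

\(\frac{\rho(x)}{\rho(x)+\frac{1}{2}\rho(\frac{x}{2})}=\frac{1}{2}\quad\Longleftrightarrow\quad\rho(x)=\frac{1}{2}\rho(\frac{x}{2})\) for all \(x>0\).

The simplest function with this property is \(\rho(x)=\frac{c}{x}\), but

\(\displaystyle\int_0^\infty \frac{c}{x}\; dx=\infty\),

so no choice of \(c\) turns it into a probability distribution. A density that is literally flat, \(\rho(x)=c\), fares no better: it gives \(P\left(s\;\big|\;a=x\right)=\frac{2}{3}\) rather than \(\frac{1}{2}\), and it cannot be normalized either.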

Now consider the modification I made: I imposed a probability distribution to make things more realistic. For any physically viable distribution, there must come a point where the distribution begins to drop off. For the sake of argument, suppose that the drop-off is very abrupt: the smaller envelope has a positive probability density for amounts below a threshold \(t\), but zero probability density for any amount larger than \(t\) (in the diagram, this drop-off happens between 2 and 3). Now suppose that you pick an envelope, and it happens to contain an amount greater than \(t\) (for instance, a value of 4 in the diagram). Now there is *no* chance that you have chosen the smaller of the two envelopes. After all, the probability of finding such a value given that we have chosen the smaller envelope is zero!
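
The abrupt-cutoff scenario is easy to sketch in code. The example below assumes that the smaller amount is uniform on \([0,t]\) with \(t=2.5\). That value is a hypothetical choice, picked only because it lies between 2 and 3, and the function names are likewise illustrative.

```python
T = 2.5  # assumed threshold, chosen only for illustration

def rho(x, t=T):
    """Assumed density for the smaller amount: uniform on [0, t], zero above t."""
    return 1.0 / t if 0.0 <= x <= t else 0.0

def p_small_given(x, t=T):
    """P(s | a = x) for the cutoff density; only defined for 0 <= x <= 2t."""
    small = rho(x, t)
    big = 0.5 * rho(x / 2, t)
    if small + big == 0.0:
        raise ValueError("x cannot occur: it lies outside [0, 2t]")
    return small / (small + big)

def switch_gain(x, t=T):
    """Expected amount after switching, minus the amount x currently held."""
    ps = p_small_given(x, t)
    return ps * 2 * x + (1 - ps) * x / 2 - x

for x in (1.0, 2.0, 3.0, 4.0):
    print(f"x = {x}: P(small) = {p_small_given(x):.3f}, gain from switching = {switch_gain(x):+.3f}")
```

Under these assumptions, an observed amount below \(t\) makes the smaller envelope twice as likely as the larger one, so switching gains \(\frac{x}{2}\) on average. An amount above \(t\) can only be the larger envelope, so switching loses \(\frac{x}{2}\) with certainty. Averaged over all possible values of \(x\), those gains and losses cancel.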

This is where the 50-50 assumption breaks down in reality. The resolution of the paradox is the realization that the original argument silently assumed a distribution of monetary amounts that stays uniform all the way to infinity.