Andrew Gelman likes us to "accept uncertainty and embrace variation" (link).

I was lucky to earn my Operations Research degree in a department filled with professors who have advanced our thinking about stochastics, learning from the likes of Erhan Cinlar, John Mulvey, Warren Powell, and Rob Vanderbei. In operations research, much of the early seminal work deals with "deterministic" problems in which the data are assumed to be fixed and available, but most real-life problems involve data that are highly variable or missing. Replacing distributions of values with their averages allows those deterministic solutions to be computed, although it is widely known that such solutions are not optimal. In other words, we must embrace variation.

These thoughts came to mind as I started reading the *Algorithms to Live By* book by Christian and Griffiths. In the first 20 pages, they kept returning to the **altar of deterministic, precise engineering solutions**.

The book begins with the classic optimal stopping problem, a sequential search in which, at each step, one can either make the decision and stop searching, or delay the decision and keep searching. House buying is an example of such a problem. The authors pronounce that there is a provably optimal solution: spend the first 37 percent of the house search period delaying the decision, then take the next house that is better than every house seen so far.

They eventually do admit that this 37-percent solution is optimal only under a set of highly unrealistic assumptions, such as that one can never bid on a house one passed over before, and that the seller will always accept one's offer.
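Here is a minimal sketch of how that rule performs under exactly those idealized assumptions (no going back, every offer accepted, houses seen in random order). The function name `success_probability` and the choice of 100 houses are mine, not the book's. It uses the standard textbook formula for the chance of ending up with the single best house when the first `skip` houses are passed over.

```python
def success_probability(n: int, skip: int) -> float:
    """Chance of picking the best of n houses when the first `skip`
    houses are only observed, and the first later house that beats
    all of them is taken (classic no-recall assumptions)."""
    if skip == 0:
        # Taking the very first house is a 1-in-n shot.
        return 1.0 / n
    # The best house sits at position i with probability 1/n; it gets
    # picked if the best of the first i-1 houses lies in the skipped set.
    return (skip / n) * sum(1.0 / (i - 1) for i in range(skip + 1, n + 1))


if __name__ == "__main__":
    n_houses = 100  # illustrative assumption
    for skip in (30, 35, 37, 39, 45):
        p = success_probability(n_houses, skip)
        print(f"skip first {skip:2d} of {n_houses}: P(best) = {p:.4f}")
```

With 100 houses, skipping the first 35, 37 or 39 all give a success probability of about 0.37. The curve is very flat near its peak, even before any of the assumptions are relaxed.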

It's not the over-selling of the math that annoys me. Assumptions can be relaxed and complexity added to the base scenario. It's the constant harping on 37 percent. It's not 35 percent, it's not 39 percent. It is precisely 37 percent, and you better believe it!

No one should believe it because another huge assumption in these "provably optimal" solutions is that the data supplied to the problem are available, and precise. That's far from the truth! In a variant of the optimal stopping problem which the authors also describe, the decision-maker is supposed to have "full information" about the value of the house, that is to say, one knows the value of each house being viewed, expressed as the precise percentile in the distribution of all house values!

This leads to the hair-raising moment in which the authors declare on page 21, "if the cost of getting another offer is only a dollar, we’ll maximize our earnings by waiting for someone willing to offer us $499,552.79 and not a dime less." (By now, the problem setting has flipped to selling rather than buying a home.)

As formulated, the problem yields a "closed-form" formula, which spits out an answer to two decimal places. If one accepts uncertainty and embraces variation, one would not care about dimes.
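To see how fragile those decimal places are, here is a rough sketch. It assumes offers are drawn uniformly between a low and a high price and that each additional offer costs a fixed amount; the $400,000 to $500,000 range is my assumption about the setup, not something quoted above. Under those assumptions, the reservation price `t` solves `E[max(X - t, 0)] = cost`, which gives `t = high - sqrt(2 * cost * (high - low))`. The function name `reservation_price` is mine.

```python
import math


def reservation_price(low: float, high: float, cost: float) -> float:
    """Lowest offer worth accepting when offers are Uniform(low, high)
    and waiting for one more offer costs `cost`.
    Derived from (high - t)**2 / (2 * (high - low)) = cost.
    Only meaningful when cost <= (high - low) / 2."""
    return high - math.sqrt(2.0 * cost * (high - low))


if __name__ == "__main__":
    # Assumed baseline: offers between $400,000 and $500,000, $1 per offer.
    print(f"baseline:         {reservation_price(400_000, 500_000, 1.0):,.2f}")
    # Nudge the inputs a little and watch the "precise" answer move.
    print(f"cost = $0.50:     {reservation_price(400_000, 500_000, 0.5):,.2f}")
    print(f"cost = $2.00:     {reservation_price(400_000, 500_000, 2.0):,.2f}")
    print(f"high = $505,000:  {reservation_price(400_000, 505_000, 1.0):,.2f}")
```

The baseline reproduces the $499,552.79 figure. Doubling or halving the assumed cost per offer moves the threshold by more than a hundred dollars, and a one-percent error in the top of the price range moves it by about five thousand.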

***

I still plan on finishing the book, and I will write a detailed review when I finish. This post is not a refutation of the entire book. It's quite common to take formulas as god-given objects. Between unrealistic assumptions and uncertain data, one can only hope for general guidance and approximate solutions to these problems.

Absolutely. The misleading pseudo-precision of models bedevils economics everywhere. Models should never have their results presented with more precision than the inputs. I've had someone present me with analysis that said 'building x is about 91.44 metres from the road'. I queried their use of 'about', and they had unthinkingly converted 'about 100 yards'.

Posted by: Richard | 03/15/2018 at 11:10 PM