It strikes me that the media loves to talk about probability, a subject journalists are ill-trained to write about. The latest example is Forbes' attempt to draw a lesson from Warren Buffett's gimmicky $1 billion NCAA pool. As we all learned, by the time the 25th match drew to a close, all 8.7 million entrants had gotten at least one winner wrong, so there would be no payout. (There are 32 matches in the first round of the tournament.)
The author calls this "human folly, bots' wisdom." Specifically, he writes: "A simple, randomized computer program probably would have done a better job of sizing up the upsets than we did."
It's not clear what he means by a "randomized computer program." I assume he means a computer that selects at random one of the 4.3 billion possible first-round brackets. If so, the computer's chance of picking the right one is also remote.
We all know friends who spent hours or days slicing and dicing data to fill out their brackets. Nate Silver posted his methodology here, for example. It's unclear how many of the 8.7 million entries were purely "human".
***
When the columnist moves to the numbers, confusion reigns:
If people had filled out their brackets totally randomly, it would have taken 2 to the 25th, or 34 million people, to cover every single possibility. So with 8.7 million entrants, a simple computer program (call it MoronBot) would have enjoyed about a 26% chance of coming up with at least one surviving contender.
If we take this at face value, then in roughly three out of four years there would be no one left after the first 25 matches, even if every entry were generated by MoronBot. So it isn't surprising that no one got past the first 25 this year in particular.
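The 26% figure is easy to reproduce; note that it treats the 8.7 million entries as if they covered 8.7 million distinct brackets. A minimal Python sketch, using the figures from the column:

```python
# Chance that at least one of 8.7 million entries survives 25 matches,
# under the "MoronBot" assumption of purely random picks.
total = 2 ** 25          # 33,554,432 possible outcomes for 25 matches
entries = 8_700_000

# If all entries are distinct brackets, coverage is exact:
p_distinct = entries / total                 # ~0.26, the Forbes figure

# If each entry is an independent random bracket (duplicates allowed):
p_random = 1 - (1 - 1 / total) ** entries    # ~0.23, slightly lower

print(f"{p_distinct:.2f} {p_random:.2f}")
```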
What is lost in the math is the fact that the entire setup is a gimmick. Buffett has already determined that the chance of a payout is zero. And it doesn't matter whether humans or bots are entering the contest. This is precisely why the entry fee for this pool is, well, zero. Ever heard of no free lunch?
***
What is driving the odds is the exponential multiplier. Let's redo the Forbes calculation, this time considering all 32 matches in the first round. Now there are 4.3 billion unique first-round brackets (up from 34 million)! Even if all 8.7 million entries were distinct brackets, they would constitute only 0.2 percent of the possible brackets. The odds have dropped over 100-fold just by observing seven more first-round matches. It would be hundreds of years before we'd expect to see one perfect first-round bracket.
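The exponential multiplier can be seen in a few lines of Python:

```python
# Moving from 25 matches to all 32 first-round matches multiplies the
# space of brackets by 2**7 = 128, which is where the 100-fold drop
# in coverage comes from.
coverage_25 = 8_700_000 / 2 ** 25   # ~26% of 25-match outcomes
coverage_32 = 8_700_000 / 2 ** 32   # ~0.2% of full first-round brackets

print(coverage_25 / coverage_32)    # 128.0
print(1 / coverage_32)              # ~494 -- the "hundreds of years"
```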
Worse than that, if this hundreds-of-years event happened, 8.7 million people would share the $1 billion prize, since they are all now part of Moronbot. Yes, that means each entrant gets about $115, but it's paid in 40 annual installments. Well, that's under $3 per year, before taxes.
My advice: just have fun picking your bracket, and focus on your office pool.
***
The columnist goes on to praise Big Data and artificial intelligence. However, the only analysis provided in the piece concerns a Moronbot making random picks, which requires zero intelligence. This is followed by a hand-waving argument: if a computer achieves X by making random picks, then the same computer would achieve X+Y by using some more intelligent algorithm, where Y is a positive number; meanwhile, if a human uses his or her brain, Y is a negative number. What is X? What is Y? What are these algorithms? And how are they different from models created by the Nate Silvers of the world?
Um, anecdotes are not data. The fact that ONE journalist may have written inaccurately about probability does not mean that "journalists" as a whole are ill-trained in writing about probability. One of my colleagues - a security reporter - has a masters in statistics. Can I put up my single data point against your single example to decide who is correct? Or can I come up with five examples of probability reported accurately to counteract your examples of statistics mistakes?
Posted by: Sharon Machlis | 04/11/2014 at 09:20 AM
Search strategy can be improved by noting that higher-seeded teams are more likely to win. If you knew that the number of first-round upsets would be no more than eight (the actual number this year), you would have to search only 15 million brackets. Somewhere there is an optimum in the number of Moronbot conspirators.
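The 15 million figure is a binomial count; a quick Python check:

```python
from math import comb

# Count the 32-match first-round brackets with at most eight upsets:
# choose which k of the 32 matches are upsets, for k = 0 through 8.
brackets = sum(comb(32, k) for k in range(9))
print(brackets)   # 15,033,173 -- the "15 million" above
```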
Posted by: Unlearner | 04/11/2014 at 09:39 AM
Sharon: Thanks for reading. Since you're a journalist, you'd understand the idea of a lede. I encourage you to browse around this blog for other stories about misuse of probability by the media. I'm hoping for the day when J school makes statistics/numbersense a required course. Now that would be something.
Unlearner: I'm not understanding why the actual number of first-round upsets this year matters. If you use that data point, will the algorithm work next year? This is the same problem as using exactly 25 matches in the original calculation. Why would 25 generalize to future years?
Posted by: Kaiser | 04/11/2014 at 10:27 AM
By Moronbot conspirators I mean contest entrants who coordinate to avoid duplicate entries and systematically probe the most likely outcomes. Fifteen million of them would have been able to probe 8-upset brackets (as well as all of those with fewer upsets), and so would have gotten the first round right this year. Pretending for the moment that the contest only concerned the first round, they would then have received about $70 each in prize money.
But if they had distributed their entries randomly over the set of possible first-round brackets, they would have had only a one in 500 chance, as you say, and thus the expected payout would be lower. There is nothing special about eight upsets, but the 8-upset possibilities are more probable than the 9-upset ones, and so should be probed first as the number of conspirators increases.
Posted by: Unlearner | 04/11/2014 at 11:47 AM
Hi Kaiser Fung:
I'm the author of the Forbes article that intrigued you ... and I'm sorry to see that my playful language about MoronBot distracted you in a way that kept you from engaging on the true thesis of the piece.
I offered two variants of the MoronBot strategy. The first, as you rightly note, involves totally random brackets, treating all battles among seeds as equal. That's a very primitive strategy. We can agree that it has a 26% probability of creating one bracket that gets the first 25 games right. I make the point that this is no worse and perhaps slightly better than humans did this year. You make the point that it's still pretty feeble. We're both right ... but all the good analysis is yet to come.
In the seventh paragraph of the article, I write: "Take the simple step of steering the program toward the most likely possibilities, based on this historical data about upsets — and MoronBot is likely to do even better." I provide a link to this wonderful data about the frequencies of upsets in various seed vs. seed matchups. http://mcubed.net/ncaab/seeds.shtml
I wrap up the point in the next paragraph by talking about an "adjusted random" strategy, which builds brackets on the basis of what sorts of upsets are most likely. Call this MoronBot2.
Before you get too carried away with name-calling, I'd like to ask you to spend a little time reflecting on MoronBot2. Now we're building brackets that assume the following:
16 v. 1 upsets never occur. (see link above; 0/120 history)
15 v. 2 upsets occur about 6% of the time (7/120). So most of our MoronBot2 brackets do not include any such upsets, but a few do.
12 v. 5 upsets occur nearly 34% of the time (47/120), so we build a lot of brackets that include 1 or 2 such upsets. Perhaps a much smaller number with 0, 3 or 4 such upsets. And so on.
The result: MoronBot2 creates 8.7 million distinct brackets that represent the most likely ways of sprinkling upsets around. Our bracket-filling program makes no effort to know anything about specific teams. It's simply a way of distributing upsets on the basis of past tourneys' results. Figuring out the precise effectiveness of such a strategy is a complex task that requires a lot of hard math and a lot of simulations. But simple eyeballing suggests that it's a lot better than MoronBot1 -- and a much better hope of producing brackets that last longer than 25 games.
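To make the idea concrete, here is a minimal sketch of such an adjusted-random bracket builder in Python. The 1v16 (0/120), 2v15 (7/120) and 5v12 (47/120) rates are the ones cited above; the other rates are rough approximations of the historical data and are only illustrative.

```python
import random

# Historical probability that the lower seed wins each first-round
# matchup. The 1v16, 2v15 and 5v12 rates are from the mcubed.net data
# linked above; the remaining rates are illustrative approximations.
UPSET_RATE = {
    (1, 16): 0.00, (2, 15): 0.06, (3, 14): 0.15, (4, 13): 0.21,
    (5, 12): 0.34, (6, 11): 0.33, (7, 10): 0.39, (8, 9): 0.51,
}

def adjusted_random_region(rng=random):
    """One region's eight games: True means the lower seed pulls the upset."""
    return {pair: rng.random() < p for pair, p in UPSET_RATE.items()}

# A full first round is four regions; MoronBot2 would generate
# 8.7 million distinct brackets this way instead of flipping fair coins.
first_round = [adjusted_random_region() for _ in range(4)]
```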
Did I explain the MoronBot2 strategy in detail in the piece? No, I didn't. You're allowed to beat up on me for murky writing. But I did include the link to the historical upset ratios. I did say that a better MoronBot strategy should be based on using those ratios. And I talked about an "adjusted random" approach to building brackets.
In this case, I think it's my writing that let you down -- not my statistics.
Posted by: George Anders | 04/11/2014 at 07:10 PM
The problem is that most people optimise their strategy to maximise the probability of each pick being correct; but if all the teams with the maximum probability of winning actually won, there would be a huge number of winners. A better strategy is to randomly assign wins based on each team's probability of winning, which is more or less what George is suggesting. If that were what people did, then someone would probably pick the first 25. All it would take is a 60% per-game success rate to have a couple of dozen winners.
What works in Buffett's favour is that even with a 90% success rate the chance of getting all 63 correct is about 1 in 1,000. Even more in his favour is that most people won't pick a team that has only a 10% probability of winning a particular game, so when such a team does win, everyone loses. Do it randomly and 10% of entries will make the correct choice.
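Both figures are easy to verify; a quick Python sketch, using the 8.7 million entry count from the post:

```python
entries = 8_700_000

# 60% per-game accuracy: expected number of entries surviving 25 matches.
survivors = entries * 0.6 ** 25    # ~25, "a couple of dozen"

# 90% per-game accuracy: chance of a perfect 63-game bracket.
p_perfect = 0.9 ** 63              # ~0.0013, about 1 in 1,000

print(round(survivors), round(1 / p_perfect))
```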
Posted by: Ken | 04/11/2014 at 08:01 PM
George: Thanks for the explanation. I think you're missing the points I raised.
First, the probability of winning you computed is conditional on 25 matches, which is the actual result for this year but in no way generalizable. I showed that increasing that number by just seven matches cuts the probability of winning by a factor of more than 100. Moronbot2 does not improve the odds by 100 times, or anywhere near it. And don't forget that surviving 25 matches is far from winning Warren's bet.
Second, in any of these schemes, all 9 million entrants are sharing the prize.
Third, being able to specify a strategy is not evidence that it works.
Would love to hear what you think of these thoughts.
Posted by: Kaiser | 04/11/2014 at 11:09 PM
Thanks, Kaiser, for the quick reply. We're getting to a better discussion with each iteration. A few thoughts on this end:
1a. Yes, trying to stay alive on the Buffett bet for all 63 games is as close to impossible as anything on this planet. But there's pride in at least surviving the first weekend (32 games), which no one did this year. And perhaps even lasting a bit longer. Make the man in Omaha bite his nails for at least a few days before he can declare victory.
That's the rationale for proposing MoronBot2.
1b. The benefits from switching to MoronBot2 are not trivial! If we assume that 1 seeds beat 16 seeds all the time (true so far) and that 2 seeds beat 15 seeds often enough (94%) that our brackets should treat this as a certainty, we have now replaced eight coin flips in MoronBot1 with eight bet-on-the-favorite calls that have a much better chance of coming right.
How much better? By the rudimentary modeling of MoronBot1, coin-flips on the eight 16v1 and 15v2 games carried a total (0.5 ^ 8) probability, or 1/256. That's if we assume that everything is a total guess. It's even worse if we allow for the fact that MoronBot1 is often betting on a near-hopeless team in those matchups.
If we switch to the more sophisticated MoronBot2 strategy, its chances in the eight games with the most heavily favored better seeds are ((1 ^4)(.94 ^4)), or about 78% right. That's if we assume that past odds in seed-v-seed matchups will continue. Two different modeling approaches.
Overall, that's a better than 100-fold improvement if we switch from a primitive to a more sophisticated model. MoronBot2 should deliver smaller improvements on the other seeds, too.
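For reference, that comparison in Python:

```python
# MoronBot1 treats the eight most lopsided games (four 1v16 and four
# 2v15 matchups) as coin flips; MoronBot2 always backs the favorite.
p_coinflip = 0.5 ** 8                   # 1/256 for all eight
p_adjusted = (1.0 ** 4) * (0.94 ** 4)   # ~0.78 for all eight

print(p_adjusted / p_coinflip)          # ~200: the 100-fold-plus gain
```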
You're right that MoronBot2 is likely (in fact nearly certain) to blow up before the tournament is over. But it should last longer than any other strategy. And there are bragging rights associated with staying alive in a tough contest for a long time. (See Norse mythology: Thor and the cat's paw.)
2. Yes, all entrants split the prize. No one can agree on the exact number of total entrants, but if we go with your 9 million, that's $111 per person. Given that all entries are free, that the process is entertaining, and that the bragging rights are considerable, I think that's an appealing lure, even if it isn't $1 billion. Besides, MoronBot2 has a fine chance of picking up the consolation prizes if no one wins the whole thing.
3. Of course. The correct test is to put MoronBot2 to work next year. I hope that someone does. If you'd like to co-develop it with me, I'd be delighted. We can blog about it and try to recruit entrants. Your analysis of MoronBot2's chances would be very engaging to read, and informative, too.
Posted by: George Anders | 04/12/2014 at 02:21 PM
In the just-posted comment above, I raced through one point too fast. Here's the full context for the paragraph that begins "How much better ..." (changes are marked in caps.)
How much better? By the rudimentary modeling of MoronBot1, coin-flips on the eight 16v1 and 15v2 games carried a total (0.5 ^ 8) probability, or 1/256. That's if we assume that everything is a total guess. THE OVERALL EFFECT ON MORONBOT1's ACCURACY IS even worse if we allow for the fact that MoronBot1 HAS RUINED MANY OTHERWISE PROMISING BRACKETS BY betting on ONE OR MORE near-hopeless teams in matchups WITH EXTREME FAVORITES.
Posted by: George Anders | 04/12/2014 at 02:28 PM