
Comments


Sharon Machlis

Um, anecdotes are not data. That ONE journalist may have written inaccurately about probability does not mean that "journalists" as a whole are ill-trained in writing about probability. One of my colleagues - a security reporter - has a master's in statistics. Can I put up my single data point against your single example to decide who is correct? Or can I come up with five examples of probability reported accurately to counteract your examples of statistics mistakes?

Unlearner

Search strategy can be improved by taking note of the fact that higher seeded teams are more likely to win. If you knew that the number of first-round upsets would be no more than eight (the actual number), you would then have to search only 15 million brackets. Somewhere there is an optimum in the number of Moronbot conspirators.
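The 15 million figure can be checked by counting the ways to place at most eight upsets among the 32 first-round games. The sketch below is only a check of that arithmetic; the function name is hypothetical, and it assumes every game has a clear favorite so that an "upset" is well defined.

```python
from math import comb

def brackets_with_at_most(upsets: int, games: int = 32) -> int:
    """Count first-round brackets that contain at most `upsets` upsets."""
    # Choose which k of the games are upsets, for every k from 0 to `upsets`.
    return sum(comb(games, k) for k in range(upsets + 1))

print(brackets_with_at_most(8))  # 15033173, i.e. about 15 million
print(2 ** 32)                   # 4294967296 possible first-round brackets in total
```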

Kaiser

Sharon: Thanks for reading. Since you're a journalist, you'd understand the idea of a lede. I encourage you to browse around this blog for other stories about misuse of probability by the media. I'm hoping for the day when J school makes statistics/numbersense a required course. Now that would be something.

Unlearner: Not understanding why the actual number of first-round upsets this year matters. If you use that data point, is this algorithm going to work next year? This is the same problem of using exactly 25 matches in the original calculation. Why would 25 generalize to future years?

Unlearner

By Moronbot conspirators I mean contest entrants who coordinate to avoid duplicate entries and systematically probe the most likely outcomes. Fifteen million of them would have been able to probe 8-upset brackets (as well as all of those with fewer upsets), and so would have gotten the first round right this year. Pretending for the moment that the contest only concerned the first round, they would then have received about $70 each in prize money.

But if they had distributed their entries randomly over the set of possible first-round brackets, they would have had only a one in 500 chance, as you say, and thus the expected payout would be lower. There is nothing special about eight upsets, but the 8-upset possibilities are more probable than the 9-upset ones, and so should be probed first as the number of conspirators increases.

George Anders

Hi Kaiser Fung:

I'm the author of the Forbes article that intrigued you ... and I'm sorry to see that my playful language about MoronBot distracted you in a way that kept you from engaging on the true thesis of the piece.

I offered two variants of the MoronBot strategy. The first, as you rightly note, involves totally random brackets, treating all battles among seeds as equal. That's a very primitive strategy. We can agree that it has a 26% probability of creating one bracket that gets the first 25 games right. I make the point that this is no worse and perhaps slightly better than humans did this year. You make the point that it's still pretty feeble. We're both right ... but all the good analysis is yet to come.

In the seventh paragraph of the article, I write: "Take the simple step of steering the program toward the most likely possibilities, based on this historical data about upsets — and MoronBot is likely to do even better." I provide a link to this wonderful data about the frequencies of upsets in various seed vs. seed matchups. http://mcubed.net/ncaab/seeds.shtml

I wrap up the point in the next paragraph by talking about an "adjusted random" strategy, which builds brackets on the basis of what sorts of upsets are most likely. Call this MoronBot2.

Before you get too carried away with name-calling, I'd like to ask you to spend a little time reflecting on MoronBot2. Now we're building brackets that assume the following:

16 v. 1 upsets never occur. (see link above; 0/120 history)

15 v. 2 upsets occur about 6% of the time (7/120). So most of our MoronBot2 brackets do not include any such upsets, but a few do.

12 v. 5 upsets occur nearly 34% of the time (47/120), so we build a lot of brackets that include 1 or 2 such upsets. Perhaps a much smaller number with 0, 3 or 4 such upsets. And so on.

The result: MoronBot2 creates 8.7 million distinct brackets that represent the most likely ways of sprinkling upsets around. Our bracket-filling program makes no effort to know anything about specific teams. It's simply a way of distributing upsets on the basis of past tourneys' results. Figuring out the precise effectiveness of such a strategy is a complex task that requires a lot of hard math and a lot of simulations. But simple eyeballing suggests that it's a lot better than MoronBot1 -- and has a much better hope of producing brackets that last longer than 25 games.
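One possible reading of the "adjusted random" idea is to draw each game's outcome at the historical upset rate for its seed matchup. The sketch below is an assumption about how that could look, not the actual MoronBot2 program: the names are hypothetical, it covers only the three matchups whose counts are quoted above, and the remaining matchups would need rates from the linked table. It also does not de-duplicate brackets.

```python
import random

# (upsets, games played) as quoted in the comment above; other seed
# matchups are deliberately left out rather than guessed.
UPSET_COUNTS = {
    (16, 1): (0, 120),
    (15, 2): (7, 120),
    (12, 5): (47, 120),
}

def sample_upsets(rng: random.Random, regions: int = 4) -> dict:
    """Draw one outcome per region for each listed matchup; True means upset."""
    picks = {}
    for matchup, (upsets, played) in UPSET_COUNTS.items():
        rate = upsets / played
        # rng.random() lies in [0, 1), so a rate of 0 never produces an upset.
        picks[matchup] = [rng.random() < rate for _ in range(regions)]
    return picks

rng = random.Random(2014)  # fixed seed only so the sketch is repeatable
print(sample_upsets(rng))
```

Drawing many such partial brackets gives mostly zero 15 v. 2 upsets and a scattering of 12 v. 5 upsets, which is the behavior the comment describes.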

Did I explain the MoronBot2 strategy in detail in the piece? No, I didn't. You're allowed to beat up on me for murky writing. But I did include the link to the historical upset ratios. I did say that a better MoronBot strategy should be based on using those ratios. And I talked about an "adjusted random" approach to building brackets.

In this case, I think it's my writing that let you down -- not my statistics.

Ken

The problem is that most people optimise their strategy to maximise the probability of getting correct results. If every team with the maximum probability of winning actually won, there would be a huge number of winners. A better strategy is to randomly assign wins based on the probability of winning, which is sort of what George is suggesting. If this were what people did, then someone would probably pick the first 25. All it would take is a 60% success rate to have a couple of dozen winners.

What works in Buffett's favour is that even with a 90% success rate the chance of getting all 63 correct is about 1 in 1,000. Even more in his favour is that most people won't pick the team that has only a 10% probability of winning a particular game, so in that 10% of cases everyone will lose. Do it randomly and 10% will make the correct choice.
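Both figures can be checked with a quick calculation. This is a rough sketch under two assumptions that are not spelled out above: every pick is independent with the same success rate, and there are about 9 million entrants (the figure used elsewhere in this thread). The function names are hypothetical.

```python
def perfect_chance(success_rate: float, games: int) -> float:
    """Chance that one entry gets `games` independent picks all correct."""
    return success_rate ** games

def expected_perfect_entries(entrants: int, success_rate: float, games: int) -> float:
    """Expected number of entries that are still perfect after `games` games."""
    return entrants * perfect_chance(success_rate, games)

# 90% per game over all 63 games: roughly 1 in 760, the same order as 1 in 1,000.
print(1 / perfect_chance(0.9, 63))

# 60% per game over the first 25 games with 9 million entrants: about two dozen.
print(expected_perfect_entries(9_000_000, 0.6, 25))
```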

Kaiser

George: Thanks for the explanation. I think you're missing the points I raised.

First, the probability of winning you computed is conditional on 25 games, which is the actual result for this year but in no way generalizable. I show that increasing that number by even a few games cuts the probability of winning by a factor of 100. Moronbot2 does not improve the odds by a factor of 100, or anywhere near it. And don't forget that surviving 25 games is far from winning Warren's bet.

Second, in any of these schemes, all 9 million entrants are sharing the prize.

Third, being able to specify a strategy is not evidence that it works.
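The sensitivity in the first point can be illustrated with a small calculation. The sketch assumes 8.7 million distinct, purely random brackets (the bracket count quoted earlier in the thread) and treats every game as a coin flip, so the chance that one of them is right through g games is the bracket count divided by 2^g. The function name is hypothetical.

```python
def coverage(distinct_brackets: int, games: int) -> float:
    """Chance that one of the distinct coin-flip brackets is right through `games` games."""
    return min(1.0, distinct_brackets / 2 ** games)

through_25 = coverage(8_700_000, 25)
through_32 = coverage(8_700_000, 32)

print(round(through_25, 3))     # 0.259, the "26%" figure
print(round(1 / through_32))    # 494, i.e. about one in 500
print(through_25 / through_32)  # 128.0: seven more games cost a factor of 2 ** 7
```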

Would love to hear what you think of these thoughts.

George Anders

Thanks, Kaiser, for the quick reply. We're getting to a better discussion with each iteration. A few thoughts on this end:

1a. Yes, trying to stay alive on the Buffett bet for all 63 rounds is as close to impossible as anything on this planet. But there's pride in at least surviving the first weekend (32 games), which no one did this year. And perhaps even last a bit longer. Make the man in Omaha bite his nails for at least a few days before he can declare victory.

That's the rationale for proposing MoronBot2.

1b. The benefits from switching to MoronBot2 are not trivial! If we assume that 1 seeds beat 16 seeds all the time (true so far) and that 2 seeds beat 15 seeds often enough (94%) that our brackets should assume this is a certainty, we have now replaced eight coin flips in MoronBot1 with eight bet-on-the-favorite calls that have a much better chance of coming right.

How much better? By the rudimentary modeling of MoronBot1, coin-flips on the eight 16v1 and 15v2 games carried a total (0.5 ^ 8) probability, or 1/256. That's if we assume that everything is a total guess. It's even worse if we allow for the fact that MoronBot1 is often betting on a near-hopeless team in those matchups.

If we switch to the more sophisticated MoronBot2 strategy, its chance of getting all eight games with the most heavily favored seeds right is (1 ^ 4)(0.94 ^ 4), or about 78%. That's if we assume that past odds in seed-v-seed matchups will continue. Two different modeling approaches.

Overall, that's a better than 100-fold improvement if we switch from a primitive to a more sophisticated model. MoronBot2 should deliver smaller improvements on the other seeds, too.
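The arithmetic behind that comparison can be laid out in a few lines. This sketch covers only the eight 16v1 and 15v2 games and takes the comment's assumptions at face value (coin flips for MoronBot1; 100% and 94% favorite win rates for MoronBot2). The variable names are hypothetical.

```python
# MoronBot1: eight independent coin flips.
coin_flip = 0.5 ** 8

# MoronBot2: always pick the favorite in four 16v1 games and four 15v2 games.
favorites = (1.0 ** 4) * (0.94 ** 4)

improvement = favorites / coin_flip

print(coin_flip)              # 0.00390625, i.e. 1/256
print(round(favorites, 3))    # 0.781
print(round(improvement))     # 200
```

On these assumptions the gain on those eight games is roughly 200-fold, comfortably above the "better than 100-fold" stated above.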

You're right that MoronBot2 is likely (in fact nearly certain) to blow up before the tournament is over. But it should last longer than any other strategy. And there are bragging rights associated with staying alive in a tough contest for a long time. (See Norse mythology: Thor and the cat's paw.)

2. Yes, all entrants split the prize. No one can agree on the exact number of total entrants, but if we go with your 9 million, that's $111 per person. Given that all entries are free, that the process is entertaining, and that the bragging rights are considerable, I think that's an appealing lure, even if it isn't $1 billion. Besides, MoronBot2 has a fine chance of picking up the consolation prizes if no one wins the whole thing.

3. Of course. The correct test is to put MoronBot2 to work next year. I hope that someone does. If you'd like to co-develop it with me, I'd be delighted. We can blog about it and try to recruit entrants. Your analysis of MoronBot2's chances would be very engaging to read, and informative, too.

George Anders

In the just-posted comment above, I raced through one point too fast. Here's the full context on the paragraph that begins "How much better?" (changes are marked in caps).

How much better? By the rudimentary modeling of MoronBot1, coin-flips on the eight 16v1 and 15v2 games carried a total (0.5 ^ 8) probability, or 1/256. That's if we assume that everything is a total guess. THE OVERALL EFFECT ON MORONBOT1's ACCURACY IS even worse if we allow for the fact that MoronBot1 HAS RUINED MANY OTHERWISE PROMISING BRACKETS BY betting on ONE OR MORE near-hopeless teams in matchups WITH EXTREME FAVORITES.


Marketing and advertising analytics expert. Author and Speaker. Currently at Vimeo and NYU. See my full bio.
