If you live in the States, and particularly a blue state, in the last year or two, it has been drilled into your head that Hillary Clinton was the overwhelming favorite to win the Presidential election. On the day before the election, when all the major media outlets finalized their "election forecasting models," they unanimously pronounced Clinton the clear winner, with a probability of winning of 70% to 99%. One should not sugarcoat these forecasts; they pointed to a clear-cut Clinton victory - even the least aggressive number, issued by trailblazer Nate Silver of FiveThirtyEight. In fact, on the eve of the election, Twitter was ablaze - Huffington Post threw a grenade at Nate, arguing that his prediction was too soft and did not give Clinton her fair due. Other hit jobs included a post on Daily Kos (link) and a comment from a respected Microsoft researcher on predictions (link).
One of the biggest and quickest stories to emanate from the shocking election result is the supposed de-legitimization of the election forecasting business. Many people have come up to me to mock these forecasters and pronounce the death of the polling business. I will leave the polling business for another day - I don't believe it is going away. The polling business is not the same as the election forecasting business. The two industries are being confounded because the election forecasters keep pointing their fingers at the polls when their big calls fell short.
tl;dr Citizens shouldn't care about this election forecasting business. It's there for the benefit of the politicians. The forecasters have been over-selling these predictive modeling technologies. This is especially true if they are merely aggregating polls. If you can't validate the accuracy of such models, how much time/money are you willing to spend on them? The prediction markets people are quiet but they too have no clothes. Journalists should spend less time writing code, and get on the road and talk to real voters.
Nate Silver was the pioneer of this election forecasting business. While some academics had developed models for forecasting elections before Nate, his FiveThirtyEight blog burst onto the scene and attracted a following. The New York Times took notice and licensed his blog; from that perch, he developed a mass-market brand. Eventually, he jumped ship, landing at ESPN, which funded an expanded data journalism venture that included sports forecasting, among other endeavors. (Disclosure: I have written features for FiveThirtyEight.)
After Nate moved out, the New York Times filled the void with a competing blog called The Upshot, where Nate Cohn took over the election forecasting beat. In the meantime, other outlets, such as Huffington Post and the Princeton Election Consortium (run by Princeton neuroscientist Sam Wang), jumped on the bandwagon, developing their own takes on forecasting elections. For the past year or so, you couldn't visit any of these sites without having the latest election odds glaring down at you.
So much happened so quickly that one may not realize there is precious little history here. Nate Silver's reputation rests on calling 49 out of 50 states correctly in the 2008 election, and then 50 out of 50 states in the 2012 election. Most of the other forecasters have only one election under their belt.
Therein lies the first problem - the election forecasting business has been dramatically oversold to the public. And yet the political forecasters are not alone; venture capitalists and the technology press have invested out-sized attention and untold dollars into the so-called predictive analytics "revolution," creating the myth that "big data" allow us to predict almost anything.
I have often reminded readers in my blog and books that all such models make errors, and frequently, errors that are significant and material. I have been alarmed by the lack of data to support the purported magic of these predictive models. Most articles that glorify this industry are heavy on hearsay and light on scientific evidence.
The 2016 election presented an emperor-has-no-clothes moment. Recall that the election forecasting business was built on top of (almost) 50-out-of-50 track records. In this election, most models got six or seven states wrong. By that metric, their performance is roughly 44 out of 50 states. That might sound like an A- (88%) - but only in a school where the minimum grade given out is a B+.
We have been using the wrong metric (scale) all along. Everyone acknowledges that only about 10 states are truly competitive ("swing states"). The other night, New York was called for Clinton almost immediately after the polls closed, with about 3000 votes counted. There is no glory in calling New York correctly. When put on the right scale, Nate called 10 of 10 in 2012 but only about 4 of 10 in 2016. Oops.
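The arithmetic behind the two scales can be made concrete. A minimal sketch (the counts come from the discussion above; the swing-state miss count is approximate and varied by model):

```python
# Accuracy looks very different depending on the denominator.
total_states = 50
missed_2016 = 6          # most models got six or seven states wrong
swing_states = 10        # only about ten states are truly competitive
swing_missed_2016 = 6    # the 2016 misses were concentrated in swing states

all_state_accuracy = (total_states - missed_2016) / total_states
swing_accuracy = (swing_states - swing_missed_2016) / swing_states

print(f"All 50 states:    {all_state_accuracy:.0%}")  # 88% - sounds like an A-
print(f"Swing states only: {swing_accuracy:.0%}")     # 40% - the honest scale
```

The first denominator pads the score with states like New York that anyone could call; the second scores only the calls that required skill.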
The election forecasters prefer that we don't tally things up this way, although they didn't complain when supporters previously cited the 50-out-of-50 statistic. The reason they provide probabilistic forecasts is that no one can be certain of an election outcome. That is a nice soundbite but the actions of Huffington Post and Daily Kos, among others, in calling out Nate Silver on the eve of the election suggest that they have become over-confident in their forecasting skill. They started to believe their own hype.
Probabilistic forecasts are very difficult to validate, especially for an event that happens only once every four years. By definition, swing states have close contests, with both parties roughly splitting the votes. Much larger samples are required to validate such calls.
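One standard way to score a probabilistic forecast against a binary outcome is the Brier score, and it illustrates the sample-size problem: with one presidential election every four years, a single score tells us almost nothing. A sketch, using hypothetical probabilities of a Clinton win (the 0.71 and 0.98 figures are stand-ins for the range of published forecasts, not exact quotes):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.
    Lower is better; always guessing 50% scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Scored against the actual outcome (0 = Clinton lost).
print(brier_score([0.71], [0]))  # about 0.50 - bad, but it's a sample of one
print(brier_score([0.98], [0]))  # about 0.96 - worse, but still n = 1
```

A pollster can accumulate hundreds of scored calls across state and local races; a presidential-level forecaster gets one data point per four years, which is why their skill is so hard to confirm or refute.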
Even for Nate Silver, we have a track record spanning only three elections - not enough to confirm his forecasting skill. The one big miss doesn't doom his work, just as the 2012 grand slam didn't make him a genius. However, I really like his final post before the 2016 election, laying out the various factors that could upset his forecast. It is through this type of writing that many experts gained respect for his work.
For anyone invested in Big Data forecasting, you should ask yourself whether it is possible to measure the performance of the forecasting models. The U.S. Presidential election has a simple, (essentially) binary outcome, and that is already easier to validate, compared to many other domains. The other prominent failure - Google Flu Trends - also has the characteristic that a ground truth is available for a proper evaluation.
Take the Get Out the Vote predictions (another topic for a different day): if, heeding a model's prediction, Clinton never visited Wisconsin and then lost the state, does that prove wrong the prediction that visiting Wisconsin would yield no benefit? Since she did not visit Wisconsin, we cannot know what would have happened if she had gone there! The world is filled with similar situations; most predictions are difficult to evaluate.
If you cannot properly measure the performance of a prediction model, how much money/time are you willing to invest in it?
As the following graph from Andrew Gelman shows, the polling errors at the state level were not that egregious, averaging about two percentage points.
However, the errors are not evenly distributed across states. The errors are concentrated on red states, and they all erred in the same direction - the polls in red states consistently under-estimated the Trump vote.
This type of error is called "bias". Something systematic was skewing those red-state polls. It could be that Trump supporters tend not to respond to polls, perhaps out of distrust. It could be that women who intended to vote for Trump did not want to say it publicly - not that far-fetched if you recall Madeleine Albright's special reservation in hell for them. As Nate Silver pointed out, there were enough undecided voters to move the needle. The pollsters will be dissecting their sample populations to find the source(s) of the under-estimation.
Some people argue that one's faith in forecasting models should not be shaken, as the shocking Trump triumph is a scenario predicted by these models, described by the other side of the 70% or 90% probabilities. That's another way of saying all forecasts come with a margin of error. I have two issues with this argument. First, all of the media outlets presented their predictions as a single number, typically with the spurious precision of one decimal place; the uncertainty around those predictions was swept under the rug. Second, some forecasters were so smitten with their own numbers that they started flame wars on the eve of the election, faulting Nate Silver for not being sure enough!
The margin of error is supposed to capture polling variability. If variability - and not bias - were the issue, the Gelman chart above would show a different pattern: we should expect the errors to be spread across red and blue states alike, falling both above and below the diagonal.
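The distinction between variance and bias can be illustrated with a toy simulation (all numbers here are invented for the sketch, not real polling figures):

```python
import random

random.seed(0)
true_margin = 0.02   # hypothetical true Trump margin in one state
n_polls = 1000

# Pure sampling variability: poll errors scatter on both sides of the truth,
# and averaging many polls recovers the true margin.
unbiased = [true_margin + random.gauss(0, 0.03) for _ in range(n_polls)]

# Systematic bias: every poll under-counts Trump by the same 2 points,
# so no amount of aggregation removes the error.
biased = [true_margin - 0.02 + random.gauss(0, 0.03) for _ in range(n_polls)]

mean = lambda xs: sum(xs) / len(xs)
print(f"Unbiased polls, mean error: {mean(unbiased) - true_margin:+.4f}")  # near 0
print(f"Biased polls, mean error:   {mean(biased) - true_margin:+.4f}")    # near -0.02
```

Aggregation averages away the first kind of error but not the second - which is why red-state errors all landing on the same side of the diagonal points to bias, not noise.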
The election forecasters also tell us everything is fine, they just need better data. Thus they deflect questions to the pollsters. The forecasters contend that their models amount to a sophisticated way to aggregate poll results. There is not much they could do about biased polls.
This leads to the critical question of the moment: why do we need an election forecasting business?
Having election forecasts does not advance our democracy. A citizen does not need to know the probability of Clinton or Trump winning to decide how he or she should vote. A citizen does not need to know how the prospect of a candidate's victory swings up or down with each passing poll weeks and months before the election takes place. If 70% or 99% were the wrong numbers to publish, what should have been the right ones - can we even answer this question after the fact?
If the forecasters are not "unskewing" the polls, and are merely aggregating them, what is their value add?
The answer may be political navel-gazing as a form of entertainment. The forecasters generate fodder for banter, such as which states are critical and what are the potential paths to victory. Maybe the following of Nate Silver and his imitators will stick around.
This election forecasting business is much more important to the politicians than to the rest of us. It helps them gauge their momentum, allocate resources, target their outreach, tailor their messaging, rally their troops, and so on. For all these reasons, they need data. They need quality data, which comes from repetitive polls, and smart analyses, including unskewing. They want our data.
This is one of the unspoken truths of the data business. Many entities want our data. They find ways to get us to hand it over, usually for free; trickery and coercion are two popular strategies. Then they make a profit out of this data. In some cases, the data benefit us directly, but in many cases, the data enrich them - and sometimes, the data we give up end up hurting us.
I don't think the election forecasting business hurts us but it isn't helping us either. This computing-intensive business is keeping people in front of their computers. Instead, the journalists should be criss-crossing the country, interviewing real voters, investigating, taking us beyond the talking points pumped out by the two parties.
One group of prognosticators is conspicuously silent this election cycle, probably lying low and hoping the storm will pass. We are talking about the "prediction markets" people. You know, the "wisdom of the crowds" - the people who disparage "experts" and eulogize the "marketplace" where people bid real money. Where are the grandiose claims that these prediction markets can predict almost anything better than the experts?
Yes, here it is. This comes from the Election Betting Odds site that aggregates data from the BetFair marketplace:
So, what did the crowd think on the eve of the election?