Even the ESPN announcers are realizing that the statistics being printed on the screen have little to do with what is transpiring in the World Cup matches. In a recent match, I noted that ESPN showed or spoke about these statistics: number of shots, number of shots on target, possession time (as proportion of playing time), total number of touches (or passes), number of corners, number of miles run.
A friend of mine asked me to predict the World Cup winner, which got me thinking about how one would go about predicting such a thing. One of the ideas in Chapter 2 of Numbers Rule Your World is that good statistical thinkers have a feel for the nature of problems. In sports, baseball is much easier to deal with than football (soccer) from a statistical perspective. Why?
Trying to predict the World Cup winner is fraught with challenges. Here are some starters:
- National teams do not play together regularly. They get together every four years to play in the World Cup, and in between they play continental championships like the Euro, plus some inconsequential tournaments and friendlies. There are very few repetitions of competitive, meaningful matches between any two countries. Statistical analysis requires repetition, and the lack of it is one big problem.
- There are also not many situational repetitions. Unlike baseball, where every at-bat can be thought of as a repetition (or perhaps every at-bat plus on-base configuration is a repetition), football is a fluid game, making it difficult to extract comparable situations. While corner kicks are easier to identify as "a situation", they too present challenges because teams use different offensive tactics, defensive tactics, and overall strategies, which make one corner kick somewhat different from another. There are probably not more than a dozen corner kicks per match anyway.
- Like baseball, football also turns on one or a few moments during any given match. The Swiss had barely any action against the Spaniards except for that one moment. Whether the US-Slovenia match was tied or not came down to a blown call. Calling one football match is as difficult as calling a single baseball game.
- The types of data that are easiest to collect in a football match have very little predictive power. Possession time means little if one side is playing a defensive strategy; number of shots means little if a team cannot penetrate the box and keeps taking long-range shots; number of miles run also means little if one side is playing defensive tactics.
- Football is much more of a team game than baseball. Goals typically result from coordinated actions by several players, so statistics that measure either a single player or the team as a whole are unlikely to capture the information needed to predict success.
- Football is very tactical. Statistics inherently reflect the tactics adopted by the team in any given match. I think that if teams hired statisticians to analyze their performance with intimate knowledge of the tactics deployed during the match, the analyses could be very useful. It is much harder for outsiders without knowledge of the intended tactics to analyze performance effectively.
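To put a rough number on the repetition problem from the first point above, here is a minimal sketch of how the uncertainty in an observed success rate shrinks as the number of repetitions grows. It uses the textbook standard error of a proportion and treats the repetitions as independent, which is itself a simplification. The sample sizes (4 matches, 500 at-bats) and the 0.5 success rate are hypothetical figures picked only for illustration, not data from any tournament or season.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical sample sizes, chosen only for illustration.
samples = {
    "meaningful matches between two national teams": 4,
    "at-bats by one hitter over a season": 500,
}

# Assumed true success rate; 0.5 is the value with the widest uncertainty.
p = 0.5

for label, n in samples.items():
    se = proportion_standard_error(p, n)
    print(f"{label}: n={n}, standard error={se:.3f}")
```

Under these assumptions, four matches leave a standard error of 0.25, so the observed rate says very little about the underlying one, while 500 at-bats bring it down to roughly 0.022.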
Predicting the winner of a match is probably not very useful except for those gambling on sports. To me, a better question is whether there are statistics that, if provided during the course of the match, would help us understand the relative strength of the two teams, how the match is going, and how it might end. For instance, in baseball, knowing the strike/ball ratio and the number of pitches thrown gives us a good sense of whether the pitcher is having a good or bad day.