Even the ESPN announcers are realizing that the statistics being printed on the screen have little to do with what is transpiring in the World Cup matches. In a recent match, I noted that ESPN showed or spoke about these statistics: number of shots, number of shots on target, possession time (as proportion of playing time), total number of touches (or passes), number of corners, number of miles run.
A friend of mine asked me to predict the World Cup winner, which got me thinking about how one would go about predicting such a thing. One of the ideas in Chapter 2 of Numbers Rule Your World is that good statistical thinkers have a feel for the nature of problems. In sports, baseball is much easier to deal with than football (soccer) from a statistical perspective. Why?
***
Trying to predict the World Cup winner is fraught with challenges. Here are some starters:
- National teams do not play together regularly. They get together every four years to play in the World Cup, and in between, they play some continental championships like the Euro, and some inconsequential tournaments and/or friendlies. There are very few repetitions of competitive, meaningful matches between any two countries. Statistics require repetitions and this is one big problem.
- There are also not many situational repetitions. Unlike baseball, where every at-bat can be thought of as a repetition (or perhaps, every at-bat plus on-base configuration is a repetition), football is a fluid game, making it difficult to extract comparable situations. While corner kicks are easier to identify as "a situation", they too present challenges because teams use different offensive tactics and different defensive tactics and different overall strategies, which make one corner kick somewhat different from another. There are probably not more than a dozen corner kicks per match anyway.
- Like baseball, football also turns on one or a few moments during any given match. The Swiss had barely any action against the Spaniards except for that one moment. Whether the US-Slovenia match was tied or not came down to a blown call. Calling one football match is as difficult as calling a single baseball game.
- The types of data that are easiest to collect in a football match have very little predictive power... possession time really doesn't mean anything if one side is playing a defensive strategy; number of shots doesn't mean anything if one team is unable to penetrate into the box and keeps taking long-range shots; number of miles run also means little if one side is playing defensive tactics.
- Football is much more of a team game than baseball. Goals typically come about based on coordinated actions by several players, and so statistics that measure one player or the entire team are unlikely to capture the information needed to predict success.
- Football is very tactical. Statistics inherently reflect the tactics a team adopts in any given match. I think if teams hired statisticians to analyze their performance with intimate knowledge of the tactics they were deploying during the match, the analyses could be very useful. It is much harder for outsiders, who do not know the intended tactics, to analyze performance effectively.
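To make the repetition problem in the first two points concrete, here is a rough sketch in Python. It uses the textbook standard error of a proportion and a normal approximation, which is crude for very small samples. The sample sizes (5 meaningful head-to-head matches, 500 at-bats) and the 50% underlying rate are illustrative assumptions of mine, not measured figures.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion p over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


def approx_95_interval(p: float, n: int) -> tuple[float, float]:
    """Rough 95% interval via the normal approximation, clipped to [0, 1]."""
    half_width = 1.96 * proportion_standard_error(p, n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical sample sizes, chosen only to contrast few vs. many repetitions.
scenarios = [
    ("meaningful matches between two national teams", 5),
    ("at-bats for one baseball hitter", 500),
]

for label, n in scenarios:
    low, high = approx_95_interval(0.5, n)
    print(f"{label}: n={n}, interval width = {high - low:.2f}")
```

With only a handful of repetitions, the interval covers most of the range from 0 to 1, so the data say almost nothing; with hundreds of repetitions, it narrows to a few percentage points.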
***
Predicting the winner of a match is probably not very useful except for those gambling on sports. To me, a better question is whether there are statistics that, if provided during the course of the match, would allow us to understand the relative strength of the two teams, how the match is going, and how it might end. For instance, in baseball, knowing the strike/ball ratio and the number of pitches thrown gives us a good sense of whether the pitcher is having a good or bad day.
Was this for me? I seem to remember requesting a statistician's view on the likely World Cup winner.
Posted by: sparklydatepalm | 06/22/2010 at 08:59 AM
I think the approach ought to be:
The one major mind shift I would make here is to look not at the probability of one team beating another, but at the probability of a team executing against a given game plan. Treat it like a chess match where a perfect game by white should always lead to victory. The statistic should show us which team is more likely to approach the perfect game. From there, maybe we could extrapolate the most likely winner.
Posted by: Melih Onvural | 06/22/2010 at 04:07 PM
There are two general tactics that are very effective in soccer: moving the ball by passing and challenging possession of the ball. Passing is important because it moves the ball up the field faster than players can run; players can always beat another player, but they can never beat the ball. Challenging is important because it doesn't give the other team time to set up or think; you force your opponent into sub-optimal decisions.
For forwards and midfielders, attempts on goal are also important; the more attempts you make, the more you'll score.
Since, as you note, there are very few opportunities to collect stats on the national teams, it seems that we might do well to collect stats on individual players and then aggregate them to compare teams.
Some naive possibilities for player-level stats: passes completed per game; passes received per game; attempts on goal per game; ratio of goals to attempts. Measuring "challenges" at the player level seems more difficult. Perhaps average distance and speed while defending. Maybe also average duration of possession, with less being better.
Posted by: Tom Hopper | 06/27/2010 at 04:06 AM
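The aggregation idea in the comment above could be sketched roughly as follows in Python. The `PlayerStats` fields, the `team_summary` function, and every number here are hypothetical and invented purely for illustration; no real player data is used, and simple summing is only one of many ways to roll players up into a team.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    """Hypothetical per-game averages for one player."""
    name: str
    passes_completed: float
    attempts_on_goal: float
    goals: float


def team_summary(players: list[PlayerStats]) -> dict[str, float]:
    """Sum player-level averages into naive team-level figures."""
    passes = sum(p.passes_completed for p in players)
    attempts = sum(p.attempts_on_goal for p in players)
    goals = sum(p.goals for p in players)
    return {
        "passes_completed_per_game": passes,
        "attempts_on_goal_per_game": attempts,
        # Guard against division by zero for a team with no attempts.
        "goals_per_attempt": goals / attempts if attempts else 0.0,
    }


# Made-up players, for illustration only.
squad = [
    PlayerStats("Forward A", passes_completed=20.0, attempts_on_goal=3.0, goals=0.5),
    PlayerStats("Midfielder B", passes_completed=50.0, attempts_on_goal=1.0, goals=0.5),
]

print(team_summary(squad))
```

A sum like this ignores how players combine on the field, which is exactly the team-game objection raised in the post, so it is at best a starting point for comparison.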