Visualizing alternative outcomes in fantasy football
Jul 18, 2013
I generated a big data set when writing Chapter 8 of Numbersense. This chapter discusses the question of how to measure your skills in managing/coaching a fantasy sports team. The general statistical question is how to separately measure two factors that both contribute to a single outcome.
In fantasy football (NFL), there is a matchup every week. Each week, you pick nine players from a roster of 14 players (rules vary by league). These nine players will score points for your team, based on how those players actually perform in real-life NFL games that week. You notch a win that week if your team scores more points than your opponent's team.
There are many ways to pick 9 players out of 14. In fact, in any given week, there are 200-300 eligible squads, of which only one is fielded. My big data set consists of all possible squads for every week for every team in the league. This data set contains rich information; the challenge is how to surface the information.
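To see how the count of eligible squads can land in the hundreds, here is a rough Python sketch. The roster makeup (2 QB, 4 RB, 4 WR, 2 TE, 1 K, 1 DEF) and the lineup rule (1 QB, 2 RB, 2 WR, 1 TE, 1 flex, 1 K, 1 DEF) are assumptions chosen for illustration, not the actual league settings behind the data set. The only point is that position slots cut the number of squads well below the unrestricted count.

```python
from collections import Counter
from itertools import combinations
from math import comb

# Hypothetical 14-player roster, listed by position only (an assumption).
ROSTER = ["QB"] * 2 + ["RB"] * 4 + ["WR"] * 4 + ["TE"] * 2 + ["K", "DEF"]


def is_eligible(squad):
    """Check a 9-player squad against the assumed lineup rule:
    1 QB, 2 RB, 2 WR, 1 TE, 1 flex (RB, WR or TE), 1 K, 1 DEF."""
    n = Counter(squad)
    # With exactly one QB, K and DEF, the other six players are RB/WR/TE;
    # the minimums below leave exactly one of them to fill the flex slot.
    return (
        n["QB"] == 1 and n["K"] == 1 and n["DEF"] == 1
        and n["RB"] >= 2 and n["WR"] >= 2 and n["TE"] >= 1
    )


# Ignoring positions: every way to choose 9 of the 14 players.
unrestricted = comb(len(ROSTER), 9)

# combinations() picks by slot in the list, so same-position players
# still count as different people.
eligible = sum(1 for squad in combinations(ROSTER, 9) if is_eligible(squad))

print(unrestricted)  # 2002
print(eligible)      # 264
```

Under these assumed rules, 264 of the 2,002 possible nine-player subsets are legal lineups. A different roster makeup or lineup rule would give a different count.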
Visualization comes to the rescue. I'll be posting a series of charts here. Today's is the first one.
There are 13 plots, each of which represents a week of the season. The 13 plots trace the decisions of a single team over the course of the season. In each plot, the vertical line indicates the points total for the 9-player squad that was actually fielded by the team owner.
The histogram shows the range of choices the team owner could have made each week. Recall there are 200-300 possible squads of nine players from which the owner selected one. For example, in Week 1, the owner didn't choose very well; there are many other sets of 9 players he could have chosen that would have scored him more points (the area to the right of the vertical line).
In Week 4, though, the owner could not have done much better. There were very few changes he could have made that would have increased his points total. The same is true of Weeks 5 and 8.
You can also see that in Week 7, the players on his roster all tanked (in real life). The entire histogram is on the left side, meaning the points totals are horrible. Contrast this with Week 13, when the histogram is located on the right side of the chart, implying that this team owner would score pretty high no matter which 9 players he fielded.
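The two readings of each plot can also be expressed as numbers: where the histogram sits, and where the vertical line cuts it. Below is a minimal Python sketch. The function name `summarize_week` and the points totals are made up for illustration and do not come from the actual data set.

```python
from statistics import median


def summarize_week(squad_totals, fielded_total):
    """Summarize one week for one team.

    squad_totals:  points totals of every eligible squad that week
    fielded_total: points total of the squad actually fielded
    """
    better = sum(1 for total in squad_totals if total > fielded_total)
    return {
        # Location of the histogram: how well the roster as a whole did.
        "median_total": median(squad_totals),
        "best_total": max(squad_totals),
        # Area to the right of the vertical line: the share of
        # alternative squads that would have scored more points.
        "share_better": better / len(squad_totals),
    }


# Made-up totals for ten squads (a real week would have 200-300).
week_totals = [62, 70, 75, 81, 81, 88, 93, 97, 104, 110]
summary = summarize_week(week_totals, fielded_total=97)
print(summary)
```

A small `share_better` corresponds to a week like Week 4, where few alternatives beat the fielded squad. A low `median_total` corresponds to a week like Week 7, where the whole histogram sits on the left.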
You can get a copy of Numbersense here. Or enter the book giveaway quiz to try your luck.
"In fact, in any given week, there are 200-300 eligible squads, of which only one is fielded."
There are 2,002 ways to pick 9 players out of 14... is there some other restriction going on here?
Posted by: Tom West | Jul 18, 2013 at 09:18 AM
Players are limited to certain positions. A quarterback, for example, can't be played as a receiver.
Posted by: Andrew | Jul 18, 2013 at 09:47 AM
I'm really confused with your descriptions compared to what I see in the charts.
For example, you say Week 4 was great, and Week 7 was terrible... But they look virtually identical to me. What am I missing here?
Posted by: Aaron | Jul 18, 2013 at 01:26 PM
I had to stare at those, too. It's the position of the histogram within the box. The horizontal scale is on #13 and applies to each box.
Posted by: Jeff | Jul 18, 2013 at 01:43 PM
Aaron: yes, I think ggplot should have included a horizontal scale on each column of charts.
Jeff's right. There are two things to look for in these charts. One is the location (& range) of the histogram, which tells you how good your full roster of players is. The other is where the vertical line cuts the histogram, which tells you how good your activated squad of 9 was relative to the alternatives.
Posted by: Kaiser | Jul 18, 2013 at 04:09 PM
I should add that there is probably a way to include a horizontal scale on each column of charts. I just used the default for facet_wrap.
Posted by: Kaiser | Jul 18, 2013 at 04:10 PM