


For some definition of the word "directly" at least. I doubt any R person would consider this contorting:


library(ggplot2)

dat <- data.frame(year=2010:2015,
                  penalties=c(627, 625, 653, 617, 661, 730))

# val: mean of the five seasons before 2015.
# NOTE: this line was cut off in the posted comment; the `last` column
# (needed by yend=last below) is assumed to be the 2015 count.
avg <- data.frame(val=mean(head(dat$penalties, -1)),
                  last=tail(dat$penalties, 1))

gg <- ggplot(dat, aes(x=year, y=penalties))
gg <- gg + geom_point()
gg <- gg + scale_x_continuous(breaks=c(2010, 2014, 2015))
gg <- gg + scale_y_continuous(breaks=c(600, 650, 700, 750),
                              limits=c(599, 751), expand=c(0,0))
# dashed line at the five-season average, then a vertical segment up to 2015
gg <- gg + geom_segment(data=avg, aes(x=2010, xend=2015, y=val, yend=val), linetype="dashed")
gg <- gg + geom_segment(data=avg, aes(x=2015, xend=2015, y=val, yend=last), color="steelblue")
gg <- gg + geom_point(data=avg, aes(x=2015, y=val), shape=4)
gg <- gg + geom_point(data=avg, aes(x=2015, y=700), shape=17, color="steelblue")
gg <- gg + labs(x=NULL, y="Number of Penalties",
                title="NFL Penalties Jumped 15% in the\nFirst 3 Weeks of the 2015 Season\n")
gg <- gg + theme_bw()
gg <- gg + theme(panel.grid.minor=element_blank())
gg <- gg + theme(panel.grid.major.x=element_blank())
gg <- gg + theme(axis.ticks=element_blank())
gg


"NFL Penalties Jumped 15% in the First 3 Weeks of the 2015 Season"

I really don't like that sentence. It tells me that there were 635 penalties in the last three weeks of the 2014 season, which is probably not true. I would prefer:

"Penalties were 15% higher in the 2015 NFL Season's First 3 Weeks than in the average of the previous five seasons' first 3 weeks"

Picky but more accurate, IMO.
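
The arithmetic behind the two readings of the title can be checked in a few lines of R. This is only a sketch using the six counts quoted in the code earlier in the thread; the variable names (baseline, pct_increase, implied_prior) are made up for illustration.

```r
# Penalty counts for the first 3 weeks of each season, 2010-2015
pen <- c(627, 625, 653, 617, 661, 730)

# Baseline: average of the five seasons before 2015
baseline <- mean(head(pen, -1))

# Percentage increase of 2015 over that baseline
pct_increase <- 100 * (tail(pen, 1) / baseline - 1)

# Count that a literal "15% jump" up to 730 would imply for the preceding period
implied_prior <- tail(pen, 1) / 1.15

round(c(baseline = baseline, pct_increase = pct_increase,
        implied_prior = implied_prior), 1)
```

The three values come out at about 636.6, 14.7 and 634.8, so the "15%" is measured against the five-season average, and the 635 figure is what the title would imply if read literally.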


Or you can use base R. Shorter code...
Here's the result, also with my idea for the title.

pen <- c(627, 625, 653, 617, 661, 730)
pen_av <- mean(head(pen,-1))   # average of the five seasons before 2015
plot(2010:2015, pen, ylim=c(600,750), las=1, bty="n",
     ylab="number of penalties", xlab="", pch=4, lwd=3, cex=1.3)
abline(h=pen_av, lty=2)
# arrow from the average up to just below the 2015 point
arrows(x0=2015, y0=pen_av, y1=tail(pen,1)-5, col="blue", lwd=2)
title(main="NFL Penalties jumped 15%\nin the first 3 weeks of 2015\ncompared to previous seasons")
text(2014, pen_av, "5 year average", adj=c(0,1.3))
text(2015.1, 700, "15%\nincr.", adj=0, col="blue", xpd=TRUE)


There is a problem with the link to Business Insider.

I find the whole argument rather dubious (the argument about the games, that is, not the graphs, though they have problems too), as all that is happening is probably random variation. If it matters, they should modify the rules so that more time is spent playing, so that it is still a good game when penalties are higher. One relevant point is that truncating the y axis distorts our perception of whether the variation is random. Starting the scale from zero would show that not that much is happening.

My other suggestion is to ignore the NFL and watch the rugby world cup.
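
The point about the truncated axis is easy to try out. Below is a minimal base R sketch, assuming the same six counts used in the code earlier in the thread; the names years and y_limits are only illustrative, and the upper limit of 10% above the maximum is an arbitrary choice.

```r
# Penalty counts for the first 3 weeks of each season, 2010-2015
pen <- c(627, 625, 653, 617, 661, 730)
years <- 2010:2015

# y axis starts at zero instead of 600; the top is padded 10% above the max
y_limits <- c(0, max(pen) * 1.1)

plot(years, pen, ylim = y_limits, las = 1, bty = "n",
     pch = 4, lwd = 3, cex = 1.3,
     xlab = "", ylab = "number of penalties")

# dashed reference line at the average of the five seasons before 2015
abline(h = mean(head(pen, -1)), lty = 2)
```

Comparing this with the version whose axis runs from 600 to 750 shows how much of the apparent jump depends on where the axis starts.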


Thanks all for the comments.

Ken: I certainly don't intend to lend credence to the underlying analysis.

R coders: R is one of the tools I considered for this. The contortions (for me) include placing the text, which also requires specifying the xlim, picking colors, picking symbols, etc. Of course I can do it; I just know other tools that take less time. Thanks for the code samples though.


Hi Kaiser, I really enjoy reading your blog. Thank you.

What 'tool' do you use for the sketches you present in your posts?


Will: the first take was created in JMP's Graph Builder which is great for sketching. Then I take it to a drawing program to customize labels, text, arrows, etc.

Marketing analytics and data visualization expert. Author and Speaker. Currently at Columbia. See my full bio.
