
Graphical advice for conference presenters

I've attended a number of talks in the last couple of days at the Joint Statistical Meetings. I'd like to offer some advice to presenters using graphics in their presentations.

Here is an example of the style of graphics that are being presented. (Note: I deliberately picked an example from a Google image search - this graphic was not used in a presentation but is representative of those I've seen.)

Example_presentation_graphic

Here are some tips to make your graphic much more impactful:

  • Use much larger font sizes. Typically, the same graphic published in a journal is used in the presentation. Other than the people sitting in the front row, no one can see any of the text, which means no one can understand anything. Most of us realize that for the bullet points on the slides, you have to pick a large font, say 20 points. The same goes for any labels or annotation on your graphics!
  • Use much thicker lines, larger dots, etc. As with the text, if you'd like people from the second row to the last row to be able to see your chart, you must enlarge everything. (For R users, cex comes in handy.)
  • Put a lot of text on the graphic itself. The graphic shown above has words but lacks any context. In many of these presentations, the audience are statisticians, many of whom work in different industries or disciplines, so we don't know what OpN, LIN, LIC mean. You may have explained this five slides prior but it's hard to expect the audience to remember. Why not just spell that out? Kendall's tau may be known to some in the audience but we still don't know - just based on what's on this chart - what correlation is being assessed. Any other text that helps explain what's on the chart should be added.
  • Add an informative title. These presentations are only 20 minutes long, and you'll spend maybe one minute explaining the graphic to someone who hasn't read the paper. You should spell out the message of your graphic - then we can look at the evidence to see how you drew that conclusion. In this example, it seems like there is a story around Flowering.
  • Avoid complex graphics. On a few occasions, presenters showed a grid of charts. These work well in a journal paper, where we have time to figure out the layout. It's hard to grasp the message and figure out how to read the chart, all in a matter of a minute or so! Just as we usually recommend one message per slide, you should stick to one message per graphic in an oral presentation.
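For readers who script their charts, the first two tips amount to applying one multiplier to every size setting, much as cex magnifies text and symbols in R. Here is a minimal sketch in Python. The keys follow matplotlib's rcParams names, but the starting sizes and the factor of 2.5 are illustrative assumptions, not recommendations.

```python
# Hypothetical journal-sized style. The keys follow matplotlib rcParams names;
# the numbers are illustrative assumptions only.
JOURNAL_STYLE = {
    "font.size": 8,
    "axes.titlesize": 10,
    "axes.labelsize": 8,
    "xtick.labelsize": 7,
    "ytick.labelsize": 7,
    "legend.fontsize": 7,
    "lines.linewidth": 1.0,
    "lines.markersize": 4,
}


def scale_for_slides(style, factor=2.5):
    """Return a copy of `style` with every size multiplied by `factor`."""
    return {key: value * factor for key, value in style.items()}


SLIDE_STYLE = scale_for_slides(JOURNAL_STYLE)

# Show the journal size next to the enlarged slide size for each setting
for key in JOURNAL_STYLE:
    print(f"{key:18s} {JOURNAL_STYLE[key]:>5} -> {SLIDE_STYLE[key]:>5}")
```

If matplotlib is installed, a dictionary like SLIDE_STYLE could be passed to matplotlib.rcParams.update() before drawing the chart, so that the text, lines and dots all grow together.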

The larger lesson is that the chart that is perfect for publication in a journal is less than perfect for an oral presentation.

 

PS. Please see here for an example of how one can remake the above chart for use in a conference presentation.


Some Tufte basics brought to you by your favorite birds

Someone sent me this via Twitter, found on the Data is Beautiful reddit:

Reddit_whichbirdspreferwhichseeds_sm

The chart does not deliver on its promise: It's tough to know which birds like which seeds.

The original chart was also provided in the reddit:

Reddit_whichbirdswhichseeds_orig_sm

I can see why someone would want to remake this visualization.

Let's just apply some Tufte fixes to it, and see what happens.

Our starting point is this:

Slide1

First, consider the colors. Think for a second: order the colors of the cells by which ones stand out most. For me, the order is white > yellow > red > green.

That is a problem because for this data, you'd like green > yellow > red > white. (By the way, it's not explained what white means. I'm assuming it means the least preferred, so little preferred that one wouldn't consider that seed type relevant.)

Compare the above with this version that uses a one-dimensional sequential color scale:

Slide2

The white color still stands out more than necessary. Fix this using a gray color.

Slide3

What else is grabbing your attention when it shouldn't? It's those gridlines. Push them into the background using white-out.

Slide4

The gridlines are also too thick. Here's a slimmed-down look:

Slide5

The visual is much improved.

But one more thing. Let's re-order the columns (seeds). The most popular seeds are shown on the left, and the least on the right in this final revision.

Slide6
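The re-ordering step is simple to script. The sketch below sorts the seed columns by their total preference score across all birds. The bird names, seed names and scores are made up for illustration and are not taken from the chart.

```python
# Hypothetical preference scores (0 = not eaten, 3 = strongly preferred).
# Names and numbers are invented for illustration; they are not the chart's data.
seeds = ["Millet", "Sunflower", "Corn", "Safflower"]
prefs = {
    "Cardinal":  [1, 3, 1, 3],
    "Chickadee": [0, 3, 0, 2],
    "Dove":      [3, 2, 3, 1],
}

# Total score of each seed across all birds
totals = [sum(row[i] for row in prefs.values()) for i in range(len(seeds))]

# Column indices from most to least popular (ties keep their original order)
order = sorted(range(len(seeds)), key=lambda i: totals[i], reverse=True)

# Apply the same column order to the header and to every row
sorted_seeds = [seeds[i] for i in order]
sorted_prefs = {bird: [row[i] for i in order] for bird, row in prefs.items()}

print(sorted_seeds)
for bird, row in sorted_prefs.items():
    print(f"{bird:10s} {row}")
```

Summing the scores is only one way to define "most popular"; counting how many birds prefer each seed would be another reasonable choice.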

Look for your favorite bird. Then find out which are its most preferred seeds.

Here is an animated gif to see the transformation. (Depending on your browser, you may have to click on it to view it.)

Redojc_birdsseeds_all_2

 

PS. [7/23/18] Fixed the 5th and 6th images, both here and in the animated gif. The row labels were scrambled in the original version.

 


Checking the scale on a chart

Dot maps and, by extension, bubble maps are popular options for spatial data, but the scale of these maps can be deceiving. Here is an example of a poorly-scaled dot map:

Farm-Dot Density

The U.S. was primarily an agrarian economy in 1997, if you believe your eyes.

Here is a poorly-scaled bubble map:

image from junkcharts.typepad.com

New Yorkers have all become Citibikers, if you believe what you see.

Last week, I saw a nice dot map embedded inside this New York Times Graphics feature on the destruction of the Syrian city of Raqqa.

Nyt_raqqa_dotmap

Before I conclude that the destruction was broadly felt, I'd like to check the scale on the map to make sure it doesn't have the problem seen above. What is helpful here is the scale provided on the map itself.

Nty_raqqa_scale

That line segment representing a quarter mile fits about 15 dots side by side. Then, I found out that a Manhattan avenue block (the longer side of a block) is roughly a quarter mile. That means the map places about 15 buildings to an avenue block. In my experience, that sounds about right: I'd imagine 15-20 buildings per block.
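The same check can be written out as arithmetic. This sketch uses only the numbers above: a quarter-mile segment, about 15 dots along it, and a guess of 15-20 buildings per block. The variable names are mine.

```python
FEET_PER_MILE = 5280

segment_feet = FEET_PER_MILE / 4      # the quarter-mile scale bar
dots_per_segment = 15                 # dots that fit side by side along the bar

# Width that each dot stands for, if one dot is one building
feet_per_dot = segment_feet / dots_per_segment

# Building widths implied by the guess of 15 to 20 buildings per block
widest_building = segment_feet / 15
narrowest_building = segment_feet / 20

print(f"Each dot covers about {feet_per_dot:.0f} feet")
print(f"Expected building width: {narrowest_building:.0f} to {widest_building:.0f} feet")
```

The dot spacing falls at the edge of the expected range, which is consistent with a scale of roughly one dot per building.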

So I'm convinced that the designer chose an appropriate scale to display the data. It is actually true that the entire city of Raqqa was pretty much annihilated by U.S. bombs.


Two good charts can use better titles

NPR has this chart, which I like:

Npr_votersgunpolicy

It's a small-multiples display of bumps charts. Nice, clear labels. No unnecessary things like axis labels. Intuitive organization by Major Factor, Minor Factor, and Not a Factor.

Above all, the data convey a strong, surprising message - despite many high-profile gun violence incidents this year, some Democratic voters are actually much less likely to see guns as a "major factor" in deciding their vote!

Of course, the overall importance of gun policy is down but the story of the chart is really about the collapse on the Democratic side, in a matter of two months.

The one missing thing about this chart is a nice, informative title: In two months, gun policy went from a major to a minor issue for some Democratic voters.

***

I am impressed by this Financial Times effort:

Ft_millennialunemploy

The key here is the analysis. Most lazy analyses compare millennials to other generations at their current ages, but this analyst looked at each generation at the same age range of 18 to 33 (i.e. controlling for age).

Again, the data convey a strong message - millennials have significantly higher un(der)employment than previous generations at their age range. Similar to the NPR chart above, the overall story is not nearly as interesting as the specific story - it is the pink area ("not in labour force") that is driving this trend.

Specifically, the millennial unemployment rate is high because the proportion of people classified as "not in labour force" has doubled in 2014, compared to all previous generations depicted here. I really like this chart because it lays waste to a prevailing theory spread around by reputable economists - that somehow after the Great Recession, demographic trends are causing the explosion in people classified as "not in labour force". These people are nobodies when it comes to computing the unemployment rate. They literally do not count! There is simply no reason why someone who has just graduated from college should be out of the labour force by choice. (Dean Baker has a discussion of the theory that people not wanting to work is a long term trend.)

The legend would be better placed to the right of the columns, rather than the top.

Again, this chart benefits from a stronger headline: BLS Finds Millennials are twice as likely as previous generations to have dropped out of the labour force.

 

 

 

 


Headless people invade London, chart claims

Some 10 days ago, Mike B. on Twitter forwarded me this chart from Time Out:

Timeout_londonpopulation_sm

Mike added: "Wow, decapitations in London have really gone up!"

A closer look at the chart reveals more problems.

The axis labels are in the wrong places. It appears that the second dot represents 1940 and the second-last dot represents 2020. There are 12 dots between those two labels, while three evenly-spaced labels sit between them. This works out to about 6.154 years between adjacent dots (80 years spread over 13 gaps), and 20 years between labels. The labels actually do not fall on top of the dots but between them! We have to conclude that the axis labels were independently applied onto the chart.
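Here is the arithmetic behind that claim, as a short sketch. It assumes, as read off the chart, that the dots at 1940 and 2020 are exact and that the dots are evenly spaced. The variable names are mine.

```python
first_label, last_label = 1940, 2020
dots_between = 12       # dots strictly between the 1940 dot and the 2020 dot
labels_between = 3      # axis labels strictly between 1940 and 2020

span = last_label - first_label                  # 80 years
years_per_dot = span / (dots_between + 1)        # 13 gaps between 14 dots
years_per_label = span / (labels_between + 1)    # 4 gaps between 5 labels

# Years shown by the three inner labels
inner_labels = [first_label + years_per_label * k
                for k in range(1, labels_between + 1)]

# Position of each inner label, measured in dot gaps from the 1940 dot.
# A label sits on a dot only if this value is a whole number.
offsets = [(year - first_label) / years_per_dot for year in inner_labels]
on_a_dot = [abs(offset - round(offset)) < 1e-9 for offset in offsets]

print(f"Years between dots:   {years_per_dot:.3f}")
print(f"Years between labels: {years_per_label:.1f}")
for year, offset, hit in zip(inner_labels, offsets, on_a_dot):
    print(f"{year:.0f}: {offset:.2f} dot gaps after 1940, on a dot: {hit}")
```

None of the three inner labels lands a whole number of gaps from the 1940 dot, which is why they appear between dots on the chart.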

I found another chart of London's population growth from here.

Londonpopulationgrowth

Superimposing the two charts:

Redo_jc_timeout_londonpopulation_1

The lowest point seemed to be around 1990-ish in the second chart but in the Time Out chart, the reader most likely assumes it occurred around 2000.

***

What else? The Time Out chart has no vertical axis, and therefore, the chart fails to deliver any data to the reader: how many people actually live in London? This style - chart with no vertical axis - has been made popular by Google (Google Trends, etc.). 

Further, one should differentiate between historical data and projections. It seems like everything on the right side that exists above the previous peak in 1940 is projected.

***

Just for giggles, I made this:

Redo_jc_timeout_londonpopulation_2