Where but when and why: deaths of journalism

On Twitter, someone pointed me to the following map of journalists who were killed between 1993 and 2015.

[Figure: map of where journalists are killed]

I wasn't sure if the person who posted this liked or disliked this graphic. We see a clear metaphor of gunshots and bloodshed. But in delivering the metaphor, a number of things are sacrificed:

  • the number of deaths is hard to read
  • the location of deaths is distorted, both in large countries (Russia) where the deaths are too concentrated, and in small countries (Philippines) where the deaths are too dispersed
  • despite the use of a country-level map, it is hard to read off the number of deaths in each country

The Committee to Protect Journalists (CPJ), which publishes the data, used a more conventional choropleth map, which was reproduced and enhanced by Global Post:

[Figure: Global Post version of the CPJ choropleth map of where journalists were killed]

They added country names and death counts via a list at the bottom. There is also now a color scale. (Note the different sets of dates.)

***

In a Trifecta Checkup, I would give this effort a Type DV. While the map is competently produced, it doesn't get at the meat of the data. In addition, these raw counts of deaths do not reveal much about the level of risk experienced by journalists working in different countries.

The limitation of the map can be seen in the following heatmap:

[Figure: redone heatmap of the CPJ data]

While this is not a definitive visualization of the dataset, I use this heatmap to highlight the trouble with hiding the time dimension. Deaths are correlated with particular events that occurred at particular times.

Iraq is far and away the most dangerous country, but only from the start of the Iraq War, and primarily during the war and its immediate aftermath. Similarly, by these counts Syria was perfectly safe to work in until the last few years.

A journalist can use this heatmap as a blueprint, and start annotating it with various events that are causes of heightened deaths.
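For readers who want to build such a grid themselves, here is a minimal sketch of the counting step, assuming the heatmap is laid out as country by year and that the data arrive as one record per death. The `records` rows and the function name `country_year_grid` are hypothetical placeholders for illustration, not CPJ figures or any official format.

```python
from collections import Counter

# Hypothetical records: one (country, year) pair per death.
# These rows are made-up placeholders, not CPJ data.
records = [
    ("Country A", 2003),
    ("Country A", 2003),
    ("Country A", 2004),
    ("Country B", 2012),
]

def country_year_grid(records):
    """Return (countries, years, grid), where grid[i][j] is the number
    of deaths in countries[i] during years[j]."""
    counts = Counter(records)
    countries = sorted({country for country, _ in records})
    first = min(year for _, year in records)
    last = max(year for _, year in records)
    # Include years with zero deaths so gaps stay visible in the grid.
    years = list(range(first, last + 1))
    grid = [[counts[(country, year)] for year in years] for country in countries]
    return countries, years, grid

countries, years, grid = country_year_grid(records)
for country, row in zip(countries, grid):
    print(country, row)
```

Keeping the empty years in the grid is a deliberate choice: the quiet stretches before and after an event are what make the time pattern readable.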

***

Now the real question in this dataset is the risk faced by journalists in different countries. The death counts give a rather obvious and thus not so interesting answer: more journalists are killed in war zones.

A denominator is missing. How many journalists are working in the respective countries? How many non-journalists died in the same countries?
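To make the denominator point concrete, here is a small sketch that turns raw counts into a rate per 1,000 working journalists. All numbers are invented for illustration (the number of working journalists per country is exactly the figure that is missing), and the function name `deaths_per_thousand` is hypothetical.

```python
def deaths_per_thousand(deaths, journalists):
    """Deaths per 1,000 working journalists."""
    if journalists <= 0:
        raise ValueError("number of working journalists must be positive")
    return 1000 * deaths / journalists

# Made-up placeholder numbers, chosen only to show how a ranking can flip.
examples = {
    "Country A": {"deaths": 20, "journalists": 2000},
    "Country B": {"deaths": 5, "journalists": 100},
}

rates = {
    name: deaths_per_thousand(d["deaths"], d["journalists"])
    for name, d in examples.items()
}

# Rank countries by rate rather than by raw count.
ranked = sorted(rates, key=rates.get, reverse=True)
for name in ranked:
    print(name, rates[name])
```

In this invented example, Country A has four times as many deaths as Country B, yet Country B carries the higher risk per journalist. That reversal is what raw counts cannot show.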

Also, separating out the causes of death can be insightful.

Comments

Andrew Gelman

Kaiser:

An additional problem with the map as shown is that it implies a false precision in the location (unless those U.S. journalists are all dying in Kansas). The choropleth map has the advantage of not misleading in this way: the data are at the country level and so the graph is presented that way.

I think the best approach would be some combination of the choropleth map and a time series. The map is good because it shows the global picture right away (although I do find the whole Alaska thing to be distracting; I'd almost like to just remove Alaska, Greenland, and a bunch of those northern islands from the map entirely, if only this wouldn't freak people out), then you can follow up with some time plot.
