The chart on the top is published, depicting the quite dramatic flattening of the growth in average spending over the last years--average being the total spend divided by the number of Medicare recipients. The other point of the story is that the decline is unexpected, in the literal sense that the Congressional Budget Office planners did not project its magnitude. (The planners did take the projections down over time so they did project the direction correctly.)
Meanwhile, Cairo asked for a chart of total spend, and Kevin Quealy obliged with the chart shown at the bottom. It shows almost straight line growth.
Cairo's point is that the average does not give the full picture, and we should aim to "show all the relevant data".
I want to follow that line of thinking further.
My first reaction is Cairo did not say "show all the data", he said "show the relevant data". That is a crucial difference. For complex social problems like Medicare, and in general, for "Big Data", it is not wise to show all the data. Pick out the data of interest, and focus on those.
A second reaction. How can "relevance" be defined? Doesn't it depend on what the question is? Doesn't it depend on the interests and persuasion of the chart designer (or reader)? One of the key messages I wish to impart in my book Numbersense (link) is that reasonable people using uncontroversial statistical methods to analyze the same dataset can come to different, even opposite, conclusions.
Statistical analysis is concerned with figuring out what is relevant and what isn't. This is no different from Nate Silver's choice of signal versus noise. Noise is not just what is bad but also what is irrelevant.
In practice, you present what is relevant to your story. Someone else will do the same. The particular parts of the data that support each story may be different. The two sides have to engage each other, and debate which story has a greater chance of being close to the truth. If the "truth" can be verified in the future, the debate is more easily settled.
Unfortunately, there is no universal standard of relevance.
Going back to the NYT story. The chart on total Medicare spending is not as useful as it may seem. This is because an aggregate metric like this for a social phenomenon is influenced by a multitude of factors. Clearly, population growth is a notable factor here. When they use the word "real", I don't know if this means actualized (as opposed to projected), or "in real terms" (that is, inflation adjusted). If not the latter, the value of money would be another factor affecting our interpretation of the lines.
Without some reference levels for population and value of money, it is hard to interpret whether the straight-line growth implies higher or lower spending intensity. For the second chart, I suggest plotting the growth in the number of Medicare recipients. I believe one of the goals of the Affordable Care Act is to reduce the ranks of the uninsured so a direct depiction of this result is interesting.
The average spend can be thought of as population-adjusted. It is a more interpretable number -- but as Cairo pointed out, it is also narrow in scope. This is a tradeoff inherent in all of statistics. To grow understanding, we narrow the scope; but as we focus, we lose the big picture. So, we compile a set of focal points to paint a fuller picture.
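The link between the two charts is simple arithmetic: total spend equals average spend times the number of recipients, so growth in the total splits into growth in the average and growth in enrollment. Here is a minimal sketch in Python. The helper name is mine and the figures are made up for illustration; they are not actual Medicare data.

```python
def decompose_growth(avg_start, avg_end, n_start, n_end):
    """Split growth in total spend into its two multiplicative factors.

    total spend = average spend per recipient * number of recipients
    """
    avg_factor = avg_end / avg_start      # growth factor of the average
    n_factor = n_end / n_start            # growth factor of enrollment
    total_factor = avg_factor * n_factor  # growth factor of total spend
    return avg_factor, n_factor, total_factor


# Hypothetical numbers for illustration only, not actual Medicare figures.
avg_f, n_f, total_f = decompose_growth(
    avg_start=10_000, avg_end=10_000,      # flat average spend
    n_start=40_000_000, n_end=50_000_000,  # growing enrollment
)
print(f"average x{avg_f:.2f}, recipients x{n_f:.2f}, total x{total_f:.2f}")
```

In this made-up case the average is completely flat, yet the total keeps growing because enrollment grows. That is why a straight-line total, on its own, says little about spending intensity.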
Alberto links to a nice Propublica chart on average annual spend per dialysis patient on ambulances by state. (link to chart and article)
It's a nice small-multiples setup with two tabs, one showing the states in order of descending spend and the other, alphabetical.
In the article itself, they excerpt the top of the chart containing the states that have suspiciously high per-patient spend.
Several types of comparisons are facilitated: comparison over time within each state, comparison of each state against the national average, comparison of trend across states, and comparison of state to state given the year.
The first comparison is simple as it happens inside each chart component.
The second type of comparison is enabled by the orange line being replicated on every component. (I'd have removed the columns from the first component as they are both redundant and potentially confusing, although I suspect that the designer may need them for technical reasons.)
The third type of comparison is also relatively easy. Just look at the shape of the columns from one component to the next.
The fourth type of comparison is where the challenge lies for any small-multiples construction. This is also a secret of this chart. If you mouse over any year on any component, every component now highlights that particular year's data so that one can easily make state by state comparisons. Like this for 2008:
You see that every chart now shows 2008 on the horizontal axis and the data label is the amount for 2008. The respective columns are given a different color. Of course, if this is the most important comparison, then the dimensions should be switched around so that this particular set of comparisons occurs within a chart component--but obviously, this is a minor comparison so it gets minor billing.
I love to see this type of thoughtfulness! This is an example of using interactivity in a smart way, to enhance the user experience.
The Boston subway charts I featured before also introduce interactivity in a smart way. Make sure you read that post.
Also, I have a few comments about the data analysis on the sister blog.
Carl Bialik used to be the Numbers Guy at Wall Street Journal - he's now with FiveThirtyEight. Apparently, he left a huge void. John Eppley sent me to this set of charts via Twitter.
This chart about Citibike is very disappointing.
Using the Trifecta checkup, I first notice that it addresses a stale question and produces a stale answer. The caption below the chart says "the peak times ... seem to be around 9 am and 6 pm." What a shock!
I sense a degree of meekness in using "seem to be". There is not much to inspire confidence in the data: rather than the full statistics which you'd think someone at Citibike has, the chart is based on "a two-day sample last autumn". The number of days is less concerning than the question of whether those two autumn days are representative of the year. Curious readers might want to know what data was collected, how it was collected, and the sample size.
Finally, the graph makes a mess of the data. While the black line appears to be data-rich, it is not. In fact, the blue dots might as well be randomly scattered and connected. As you can see from the annotations below, the scale of the chart makes no sense.
Plus, the execution is sloppy, with a missing data label.
The next chart is not much better.
The biggest howler is the choice of pie charts to illustrate three numbers that are not that different.
But I have to say the chart raises more questions than it answers. I am not an expert in pregnancy but doesn't a pregnant woman's weight include the weight of the baby she's carrying? So the more weight the woman gains, on average, the heavier is her baby. What a shock!
The last and maybe the least is this chart about basketball players in the playoff.
It's the dreaded bubble chart. The players are arranged in a perplexing order. I wonder if there is a natural numbering system for basketball positions (center = #1, etc.), like there is in soccer. Even if there is such a natural numbering system, I still question the decision to confound that system with a complicated ranking of current-year playoff players against all-time players.
Above all, the question being asked is uninteresting, and so the chart is uninformative. A more interesting question to me is whether the best players are playing in this year's playoff. To answer this question, the designer should be comparing only currently active players, and showing the all-time ranks of those players who are playing in the playoffs versus those who aren't.
One of the dangers of "Big Data" is the temptation to get lost in the details. You become so absorbed in the peeling of the onion that you don't realize your tear glands have dried up.
Hans Rosling linked to a visualization of tobacco use around the world from Twitter (link to original). The setup is quite nice for exploration. I'd call this a "tool" rather than a visual.
Let's take a look at the concentric circles on the right.
I appreciate the designer's concept -- the typical visualization of this type of data is looking at relative rates, which obscures the fact that China and India have far and away the most smokers even if their rates are middling (24% and 13% respectively).
This circular chart is supposed to show the absolute distribution of smokers across so-called "super-regions" of the world.
Unfortunately, the designer decided to pile on additional details. The concentric circles present a geography lesson, in effect. For example, high-income super-region is composed of high-income North America, Western Europe, high-income Asia Pacific, etc. and then high-income North America is composed of USA, Canada, etc.
Notice something odd? The further out you go, the larger the circular segments but the smaller the number of people they represent! There are more people in the super-region of high-income worldwide than in high-income North America and in turn, there are more people in the high-income North American region than in USA. But the size of the graphical elements is reversed.
In principle, the "bumps"-like chart used to show the evolution of tobacco prevalence in individual countries makes for a nice visual. In fact, Rosling marvelled that the global rate of consumption has fallen in recent years.
However, I'm often irritated when the designer pays no attention to what not to show. There are probably well above 200 lines densely packed into this chart. It is almost for sure that over-plotting will cause some of these lines to literally never see the light of day. Try hovering over these lines and see for yourself.
The same chart with say 10 judiciously chosen lines (countries or regions) provides the reader with a lot more profit.
The discerning reader figures out that the best visual actually does not even show up on the dashboard. Go ahead, and click on the tab called "Data" on top of the page. You now see a presentation of each country's "data" by age group and by gender. This is where you can really come up with stories for what is going on in different countries.
For example, the British have really done extremely well in reducing tobacco use. Look at how steep the declines are across the board for British men. (In most parts of the world, the prevalence of smoking is much higher among men than women.)
Bulgaria on the other hand shows a rather odd pattern. It is one of the few countries in the bumps chart that showed a climb in smoking rates, at least in the early 2000s. Here the data for men is broken down into age groups.
This chart exposes a weakness of the underlying data. The error bars indicate to us that what is being plotted is not actual data but modeled data. The error bars here are enormous. With the average at about 40% to 50% for many age groups, the confidence interval is also 40% wide. Further, note that there were only three or four observations (purple dots) and curves are being fitted to these three or four dots, plus extrapolation outside the window of observation. The end result is that the apparent uplift in smoking in the early 2000s is probably a figment of the modeler's imagination. You'd want to understand if there are changes in methodologies around that time.
As a responsible designer of data graphics, you should focus less on comprehensiveness and focus more on highlighting the good data. I'm a firm believer in "no data is better than bad data".
Business Insider (link) highlighted a map showing childhood food insecurity across the 50 states, with the data coming from a report by Brookings.
This is a nice map. I like the tones of the chosen colors although the colors are not intuitively matched to magnitude. (There is a small labeling issue in the New England section.) The message is very clear.
*** I wondered about the scale, in particular, the use of equal sized buckets to split the scale. As a designer, several key decisions here include the number of buckets, and the size of each bucket. The following chart shows the choice made by this designer:
In this chart, all the states are ranked by their food insecurity rates with the lowest on the left and the highest on the right. The three horizontal lines show where the current cutoff values are. They form two equal sized blocks because of the equal spacing chosen by the designer. There are a total of four buckets.
Now if you ignore the dashed lines, and focus on the solid line showing the increasing food insecurity rates, you'd notice that maybe there are only three buckets, not four. The following amended chart shows where I'd put the cutoff values (18% and 23%), resulting in three buckets.
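To make the bucketing decision concrete, here is a rough sketch in Python of how the same rates land in different buckets under equal-width cutoffs versus the 18% and 23% cutoffs. The state labels, the rates and the 10%-to-30% range are all invented for illustration; they are not the Brookings figures, and the function names are my own.

```python
from bisect import bisect_right


def equal_width_cutoffs(low, high, n_buckets):
    """Cutoffs that split [low, high] into n_buckets equal-width ranges."""
    width = (high - low) / n_buckets
    return [low + width * i for i in range(1, n_buckets)]


def bucket_index(rate, cutoffs):
    """Return 0 for the lowest bucket, len(cutoffs) for the highest."""
    return bisect_right(cutoffs, rate)


# Hypothetical state rates in percent, not the Brookings data.
rates = {"A": 11.0, "B": 17.5, "C": 19.0, "D": 22.0, "E": 24.5, "F": 29.0}

equal = equal_width_cutoffs(10.0, 30.0, 4)  # three cutoffs, four buckets
custom = [18.0, 23.0]                       # two cutoffs, three buckets

for state, rate in rates.items():
    print(state, rate, bucket_index(rate, equal), bucket_index(rate, custom))
```

The point of the comparison is that equal-width cutoffs are set without looking at the data, while the custom cutoffs are placed where the ranked rates show natural breaks.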
With the new cutoff values, let's look at what the map looks like:
I'm pretty happy with this. It shows an even clearer picture. There are three clusters of states, most of the south and west suffer more than the north and east. The odd state here and there (e.g. Louisiana) turned out not to be so special.
But this version picks out the "outliers", the group that has better food insecurity rates than the rest of the country (as shown on the left side of the line charts). These particularly well-performing states are North Dakota and Minnesota, New Hampshire and Massachusetts, and Virginia.
A small shift in the scaling cleans up the message!
Here is the same map with a progressive color scheme:
Reader omegatron came back with another shocking instance of a pie chart:
Here is the link to the AVERT organization in the U.K. that published the chart and several others.
For the umpteenth time, the pie chart plots proportions. All proportions are percentages but some percentages are not proportions. The data here would appear to be "rate of diagnosis" rather than proportion of diagnoses by age.
The data came from Table 3a of this CDC report (link), and they are clearly labelled "Rate". The footnote even disclosed that the "Rate" is measured per 100,000 people so they are being mislabeled as percentages.
Let's summarize. The percentages add up to much more than 100%, so they are clearly not proportions. They are not even percentages: they are rates per 100,000.
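A quick sanity check before reaching for a pie chart is to ask whether the numbers are parts of a single whole. Here is a minimal sketch of such a check in Python. The function name and tolerance are my own choices, and the sample values are invented rather than taken from the CDC table.

```python
import math


def can_be_pie(values, total=100.0, tol=0.5):
    """True if the values are non-negative and sum to the whole (within tol)."""
    return all(v >= 0 for v in values) and math.isclose(
        sum(values), total, abs_tol=tol
    )


# Hypothetical numbers for illustration, not the CDC data.
shares = [50.0, 30.0, 20.0]                # parts of one whole
rates_per_100k = [12.0, 35.0, 30.0, 40.0]  # rates for separate age groups

print(can_be_pie(shares))
print(can_be_pie(rates_per_100k))
```

Passing this check is necessary but not sufficient: rates for separate groups could add up to 100 by coincidence and still not be proportions of anything.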
omegatron even got confused by the colors. You'd think that the slices would be arranged by age group but no! The order of the slices is by size of the pie slices, with one exception--the lime green slice of 11.4%, which I cannot explain. In practice, this means the order goes from Under 13 to 13-14 to Over 65 to 60-64 to 50-54, etc.
A smarter use of color here would be to stick to one color while varying the tint according to the rate of diagnosis. Using 13 colors for 13 age groups is distracting.
As a teacher, I find it shocking that such pie charts continue to see the light of day. It's very disappointing, as I'd assume every teacher who teaches the pie chart will have pointed out the pitfalls. Why is this happening?
With this chart, I'm mostly baffled by the top corner of the Trifecta Checkup. What is the point of this data? If I understand the "per 100,000 population" definition, these rates are computed as the number of diagnosed divided by the population in each age group. So the diagnosis rate is a function of how many people in each age group are actually infected, how effective the diagnosis procedures are, and whether that effectiveness varies with age. Add to that the completeness of reporting by age group. (The footnote acknowledged that the mathematical model does not account for incomplete reporting. To call a spade a spade, that means the model assumes complete reporting.)
The rate of diagnosis can be low because the rate of infection is low or because the proportion of the infected who get diagnosed is low. I just can't conceive of a use of data that confound these factors.
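To see the confounding in numbers, here is a simplified sketch in Python. It assumes the diagnosis rate is just the infection rate times the share of infected people who get diagnosed, and it ignores reporting completeness. The function name and all values are hypothetical.

```python
def diagnosis_rate_per_100k(infection_rate, detection_share):
    """Diagnoses per 100,000 people = infection rate x share diagnosed."""
    return infection_rate * detection_share * 100_000


# Two hypothetical age groups with very different underlying situations.
group_a = diagnosis_rate_per_100k(infection_rate=0.0010, detection_share=0.2)
group_b = diagnosis_rate_per_100k(infection_rate=0.0002, detection_share=1.0)

print(round(group_a, 6), round(group_b, 6))
```

In this made-up case, group A has five times the infection rate of group B but only a fifth of its infections are diagnosed, so the two groups show the same diagnosis rate.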
A time series treatment would be interesting although that addresses a different question.
Josh hated this "dataless visualization" from ABC. (link; warning: ads). Here are his comments:
The report has planes leaving China, landing across the globe and instantly infecting us all with bird flu. It doesn't do a good job explaining how and the rate pandemics actually spread. However, it does do a good job scaring us all.
The entire flu pandemic theater is unscientific. It is based on the 100-year flood type of argument, with scientists claiming that we are "overdue" for some catastrophe. Reminds me of earthquake forecasting, covered by Nate Silver in his book. It is possible to predict the average frequency of, but virtually impossible to predict the timing of rare natural disasters.
The 100-year flood type calculation is based on averaging a small number of events over a very long time scale. There is no reason why these events should be spread out evenly over time (i.e. one event every 100 years).
This is a fallacy of "law of small numbers": if one throws a fair coin 10 times, one shouldn't expect exactly 5 heads, as the distribution of heads should look like the chart on the right. The chance of exactly 5 heads is only 25%.
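The coin-toss numbers are easy to verify with the binomial formula. Here is a short Python sketch. The second calculation applies the same formula to a "1-in-100-year" event, under my own simplifying assumption that each year independently has a 1% chance.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Distribution of the number of heads in 10 tosses of a fair coin.
for k in range(11):
    print(k, round(binom_pmf(k, 10), 4))

# Chance of exactly 5 heads: 252 / 1024, a little under 25%.
p_five = binom_pmf(5, 10)

# Chance that a "1-in-100-year" event happens exactly once in 100 years,
# assuming independent years with a 1% chance each (an assumption).
p_once = binom_pmf(1, 100, 0.01)
print(round(p_five, 4), round(p_once, 4))
```

Under that assumption, the chance of seeing exactly one such event in a given 100-year span is only a bit over a third, so "one event every 100 years" is far from guaranteed.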
Also, doctors keep me honest but I believe only one type of mutation, i.e. the one that makes the virus able to pass from human to human, has a chance of causing a pandemic. So it is wrong to say that "if the virus mutates," a pandemic will result. In addition, in the past, some viruses were able to pass from human to human but the rate of infection was not fast enough, and they failed to lead to a pandemic.
Daniel L. did not like the map shown below, from a research article on female mortality rate in the U.S., via Jezebel.
I was amused by what the blogger at Jezebel was able to take in from the map. Her post started with a huge version of the map, under which she said:
Mortality rates are rising in 43% of U.S. counties, as illustrated by this map from health researcher Bill Gardner.
Mortality rate is a statistic about the population. The map is an illustration of geographical area (distorted by the map projection). The map carries no information about population at all. Thus, it is not the right chart to display population data.
The statistic itself is poorly chosen. What does 43% of counties mean? Some counties have few people while others are very densely populated. New York County is barely visible on this map yet it has the heaviest weight on the average.
According to the CDC data, the death rate, age-adjusted, for women has been decreasing over time. So, the backward motion in those 43% of counties is somehow compensated for by forward progress in the other 57% of the counties, it appears.
Maybe the average for the whole country masks some local patterns. The cited map doesn't help because it assumes that the importance of the mortality rate is proportional to the geographical size of the county, when the right comparison should be the population of women in the county.
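The gap between "share of counties" and "share of women" can be illustrated with a few lines of Python. The counties below are invented for the purpose, and the function name is my own.

```python
def share_rising(counties):
    """Return (share of counties, share of female population) with rising rates.

    Each county is a (female_population, rate_is_rising) pair.
    """
    n_rising = sum(1 for _, rising in counties if rising)
    pop_total = sum(pop for pop, _ in counties)
    pop_rising = sum(pop for pop, rising in counties if rising)
    return n_rising / len(counties), pop_rising / pop_total


# Hypothetical counties: three small ones with rising rates and one large
# one with a falling rate.
counties = [(10_000, True), (20_000, True), (20_000, True), (950_000, False)]
by_county, by_population = share_rising(counties)
print(f"{by_county:.0%} of counties, {by_population:.0%} of women")
```

In this made-up example, most counties show rising mortality while only a small fraction of women live in them. A count of counties and a count of people can tell very different stories.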
Reader James H. spotted this offensive pie chart in Forbes (link).
This chart tells us that emerging markets will be responsible for the greatest growth in medical spending up to 2016.
It is hard to find this message in the chart. The gray sector for Japan in 2006 reads 10%, the exact same number as the gray sector in 2016, which appears several times as large. In a pie chart, it is hard enough to compare the sectoral areas within a pie, let alone sectors of different-sized pies.
James noticed that the pie areas are incorrect. The 2016 pie should be roughly double the area of the 2006 pie. This is not the case. It seems like the radius of the 2016 pie is at least three times larger than that of the 2006 pie.
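The geometry is worth spelling out. Area grows with the square of the radius, so doubling the area calls for scaling the radius by the square root of 2, not by 2 or 3. A small Python sketch, with function names of my own choosing:

```python
from math import sqrt


def radius_ratio(total_ratio):
    """Radius scale factor that makes pie AREA proportional to the total."""
    return sqrt(total_ratio)


def area_ratio(radius_scale):
    """Area scale factor implied by scaling the radius."""
    return radius_scale**2


print(radius_ratio(2))  # radius factor needed to double the area
print(area_ratio(3))    # area factor implied by tripling the radius
```

A radius three times as large implies a pie nine times the area, far more than the doubling that the data call for.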
As usual, a line chart brings out the trend more clearly:
The projected numbers should be clearly labelled as such. "2016" should read "2016P". I'm not sure if the 2011 number was projected also - depends on when the data source was published.
The worst thing about this chart is it's completely misleading. It fails to recognize that there are many billions of people in emerging markets and "rest of the world" while the U.S., Europe and Japan combined have just over one billion people. Thus, all this chart is really saying is that population growth in the next several years will mostly occur in emerging markets. One can substitute medical spending with any kind of mass market spending and have essentially the same picture.
Below is a rough estimate of the per-capita medical spending by region using population sizes in 2011. For emerging markets, I have substituted BRIC, i.e. Brazil, Russia, India and China, which underestimates the population and thus overestimates the per-capita spend. These parts of the world spend a fraction of what industrialized countries are spending. So what's the story?