
Fifteen points, add confusion

Bloomberg Markets (November) has this chart showing national debt as a percentage of GDP for selected countries.


If you count the points, there are only 15 on the chart, three annual numbers for each of five countries.

The theme appears to be redundancy. Countries are identified by both their names and their flags, one placed vertically and one placed horizontally. The 2011 debt-to-GDP ratios are displayed twice, once as the rightmost points on those lines, and once as data labels on the far right. The vertical axis and gridlines also provide information that is quite unnecessary when there are labels on the right.

The five dots soak up much of our attention but there is no data in them. In particular, the Greek and U.S. dots fall in between two years where the straight lines are just convenience.

If all the lines were given the same color, then it would be straightforward to highlight the U.S. data by giving it a different color. The switch to an area chart for U.S. data is both ugly and distorting. The distortion is due to the vertical axis not starting at zero -- this is acceptable for line charts but certainly not for area charts.


I find that the data themselves present a challenge for interpretation. Based on my reading of economic news, Greece, Spain and Italy are the current poster children for countries with debt problems, while Japan has long had a debt problem but has more recently been associated with stagnation.

The issue is that comparing each of these countries to the U.S. brings almost zero insight. Italy, for example, probably has some kind of legislative control that puts a lid on debt at 120 percent of GDP. Japan, while not currently in the conversation about countries in danger of default, has a much worse debt-to-GDP ratio than Greece, which is in serious trouble. By this measure, Spain is nowhere near as indebted as the U.S. and yet it is in a lot of trouble.

In the following version, I plotted the relative change in the metric with 2009 set to 100.
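
The rebasing behind this kind of view is simple arithmetic: divide each year's ratio by the 2009 value and multiply by 100. Here is a minimal Python sketch. The function name and the three ratios are placeholders I made up for illustration; they are not the numbers in the Bloomberg chart.

```python
def index_to_base(series, base_year):
    """Rescale a {year: value} series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


# Placeholder debt-to-GDP ratios (percent of GDP), NOT the published data.
example = {2009: 80.0, 2010: 88.0, 2011: 100.0}
indexed = index_to_base(example, base_year=2009)
print(indexed)
```

With every country starting at 100, the lines show how fast each ratio grew rather than how high it sits, which is a different question from the one the original chart asks.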


Purple political speech

I enjoy looking at the New York Times' summation of National Convention speeches via visualization. (link)

It's a disguised word cloud combined with a bubble chart with a little bar chart thrown in for good measure.

The size of the bubble is the total number of mentions of particular words or phrases. So the bubbles tell us the importance of specific concepts in aggregate of two parties.

It's the split within each bubble that represents the relative emphasis by party. Helpfully, the bubbles are sorted from left to right with the most Democratic words on the left. This splitting uses a bar chart paradigm. The diameter of the bubble is being partitioned, not the areas of the segments.
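
To see how much partitioning the diameter differs from partitioning the area, here is a rough Python sketch. It assumes the split is a straight vertical line across the bubble, which is my reading of the graphic and not something the Times documents. The function name is my own.

```python
import math


def area_share(diameter_share):
    """Fraction of a circle's area to the left of a vertical cut placed
    diameter_share (0 to 1) of the way along the horizontal diameter."""
    d = 2.0 * diameter_share - 1.0  # cut position on a unit circle, -1 to 1
    # Area left of x = d on a unit circle, divided by the full area (pi).
    return (d * math.sqrt(1.0 - d * d) + math.asin(d) + math.pi / 2.0) / math.pi


for p in (0.1, 0.25, 0.5, 0.75):
    print(f"diameter share {p:.2f} -> area share {area_share(p):.3f}")
```

Under that assumption, a party given a quarter of the diameter ends up with a bit less than a fifth of the bubble's area, so the smaller side always looks smaller than the data say.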


I wanted to see this as a straight-out word cloud. In the following, I use the red-blue-purple color gradient to indicate the Republican-Democratic bias, and the size of the words to indicate the number of mentions.


This word cloud is created using the Wordle tool, advanced options. My colleague John helped me pick the colors. (By the way, I don't like the insertion of small words within large letters, like what happened here inside the O in Obama.)

Also, I'd line the colors up so that the red words are on one side, blue on the other and purple in the middle. I'd need a different tool to be able to exercise this type of control.

What's wrong with this food picture?

Here's a chart in the November edition of Bloomberg Markets:


Curiosities include how they split up the lamb chop, and why an onion was chosen to represent "fresh vegetables/melons".

The chart contains some strange data that make readers feel nervous. For example, the fish image seems to say 88 percent of seafood eaten in the States is imported, and yet the two largest source countries listed below (China and Vietnam) together account for only 22.5 percent. So the residual 65.5 percent must be split among at least 10 countries each accounting for not more than 6.5 percent of the total.

Then when you look at vegetables, Mexico and Canada together supply 72 percent. But the onion graphic tells us it's less than 20 percent. The categorization seems to be different between the top and the bottom layers. We have "fruit and nuts" / "fresh vegetables/melons" on the one side, and "fruit" / "vegetables" on the other side.

And why are melons combined with fresh vegetables rather than fruit?

Can information be beautiful when information doesn't exist?

Reader Steve S. sent in this article that displays nominations for the "Information is Beautiful" award (link). I see "beauty" in many of these charts but no "information". Several of these charts have appeared on our blog before.

Let's use the Trifecta checkup on these charts. (More about the Trifecta checkup here.)


The topic of this chart is both tangible and interesting. As someone who loves books, I do want to know what genres of books typically win awards.

However, both the data collection and graphical design make no sense.

The data collection problem presents a huge challenge and it's easy to get wrong. The problem is how narrow a theme should be. If it's too narrow, you can imagine every book has its own set of themes. If it's too wide, each theme maps to lots of books. The challenge is how to select the themes such that they have similar "widths". For example, "death" is a very wide theme and lots of books contain it, as indicated by the black lines. "Nanny trust issues" is a very narrow theme, and only one of those books deals with this theme. When there is such a theme, is its lack of popularity due to its narrow definition or due to writers not being interested in it?


The caption of this chart said "Cover stars: Charting 50 years up until 2010, this graphic shows The Beatles to be the most covered act in living memory." If that is the message, a much simpler chart would work a lot better.

Since the height of the chart indicates the number of covers sold in that year, the real information being shown is the boom and bust cycles of the worldwide economy. So, a lot more records were sold in 2005, and then the market tanked in 2008, for example.

That's why the data analyst should think twice before plotting raw data. Most data like these should be adjusted. In this case, you could either compare artists against one another in each year (by using proportions) or you have to do a seasonal and trend adjustment. I also don't see the point of highlighting year-to-year fluctuations. Nor do I understand why only in certain years is the top-rated cover identified by name and laurel wreath.
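
The proportions adjustment is easy to sketch in Python. The artist names and counts below are invented for illustration only, and the function name is my own; nothing here comes from the chart's data.

```python
def shares_by_year(counts):
    """Convert {year: {artist: count}} into {year: {artist: share of that year's total}}."""
    result = {}
    for year, by_artist in counts.items():
        total = sum(by_artist.values())
        result[year] = {artist: n / total for artist, n in by_artist.items()}
    return result


# Placeholder counts, invented for illustration only.
counts = {
    2005: {"Artist A": 30, "Artist B": 70},
    2008: {"Artist A": 15, "Artist B": 35},
}
shares = shares_by_year(counts)
print(shares)
```

In this made-up example the raw counts are cut in half between the two years, yet each artist's share is unchanged. That is the sort of market-wide swing the raw-data chart ends up displaying instead of the comparison between artists.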



I talked about this stream graph of 311 calls back in 2010. See the post here.



I featured this set of infographics/pie charts back in 2011. See the post here.



This chart is a variant of the one from New York Times that I discussed here. I like the proper orientation on the NYT's version. The color scheme here may be slightly more attractive.



Why my love affair with Apple is about to end

I admit it. I have been an Apple fanboy for as long as I can remember. My first computer was a Mac, back in the days when screens were black and white and the smiling Mac icon stared you in the face. When I had just graduated from college, and really shouldn't have been splurging on expensive computers, I made do with a Mac clone (remember those?). In recent years, I have had three functioning Apple computers at the same time, plus iPods, iPhones, etc. While the prices are high, I have always appreciated the quality and the customer service.

Unfortunately, those two things are now making me rethink my relationship with Apple. The situation is still developing. The short version is I'm trying to locate my old hard drive that was replaced during repairs (done three days ago, and counting). If anyone knows what I should do to make this happen, please let me know! If I don't get my drive back, a chunk of the history of Junk Charts will be gone, as I have only backed up portions of it... the sketches, notes, data sets, etc. that came before the posts here will be gone, forever.


The saga began with my MacBook Pro not booting up. The laptop served me well for five years. I never once had any repairs, except for buying new batteries. Last weekend, the screen went black but it still partially booted. This itself was curious as my prior iBook also had no issues for five years and then died. Given that Apple officially considers models obsolete after five years (or did the "genius" say six years?), the timing of when these laptops got heart attacks is a little suspicious. It's only a sample size of two. In any case, I wish the customer service rep could pull up a screen and see my collection of Apple computers and realize he didn't need to sell me a new laptop -- I already bought a new one a couple of years ago.

The "genius" did what he was supposed to do: explain the legalese that washes Apple's hands after they wipe out all of your data. My choice was spending hundreds of dollars and time to extract a few months of work that weren't backed up, or taking a small chance that the hard drive would be replaced. The "genius" and I both heard the spinning of the hard drive, and felt it was unlikely to need fixing.

Of course, when the computer got sent back (after only 3 days!) to me, I saw the ominous note saying "the following parts were replaced: logic board, hard drive". The "symptom" that led to replacing the hard drive was listed as "hard drive not recognized/mount". That indicated to me the hard drive itself was fine, and maybe with luck I could get it back, plug it into a different device, and retrieve the lost data.


The day after I received the shipment, I called the number listed on the repair report. They told me to call the Soho store manager as that's where my request would be handled. I called them, they searched around a bit, and told me that the repair work was done elsewhere (Houston, TX, according to the report), and so I was asked to leave my phone number for a store manager to call me back.

Since time was of the essence, I just showed up at the store the next morning. In the intervening hours, I didn't get a call back. The people at the store were nice, and told me they had put in a request, and in a few days, I could call and check the result. I felt reasonably happy.


Oddly, within five minutes of walking out of the store, I got a call from another person from the store, who said she was calling about the message I left the previous day. She insisted on talking to me about the case even though I told her it's been taken care of by the people I just spoke to.

Then came a conversation that I'd remember in the future as the moment of my breakup with Apple. In many ways, it's typical customer service of most American companies today but I hold Apple to higher standards, since I'm a fanboy, I know they (used to) have better service, and I paid a lot of money for my computers.

Here are some highlights:

  • She claimed that her colleagues were completely wrong. According to some "notes" (which she later claimed came from an eCRM system), the technicians erased my hard drive, and therefore there was nothing they could do about my situation.
  • When I told her the repair report specifically said "the following parts were replaced", she said she wouldn't believe it. She wanted me to walk the sheet of paper over to her at the store to prove it.
  • She refused to put any of her various comments in writing.
  • She told me no one ever gets hard drives retrieved, whether they were erased or replaced. (Amusingly, no other Apple employee whom I spoke to during this saga mentioned this pertinent "fact".) She back-tracked when I told her I knew people who got their drives back.
  • I asked her then how she could figure out what happened just by reading "notes" without talking to anyone. She started reading the note to me. I did not hear the words "erase the hard drive". It just said "OS was clean installed"; you would have to do that if a new drive was plugged in anyway.
  • I asked her to forward the "notes" she was reading; she said she couldn't. She said she would give me her name. I asked how that was going to help if she later denied telling me any of the above.
  • I asked her if someone at the store had spoken to the person who did the repair work; she said they were not allowed to.

It didn't take her long to pull out the "terms and conditions" scam. Oh, the store warned us that there would be a chance the data would be wiped out. I explained to her that the chance was low and that's why I went ahead. In addition, the "genius" discussed replacing the hard drive, not erasing and writing over the old data. She lectured me on how I should never take any risk, even if it's a 1 percent chance. I asked her if she'd never walk out of her home because there is a small chance she could get hit by a car. She said that was irrelevant.

Now, she was threatening to hang up on me. This was because I disrespected her. How did I disrespect her? I described the "terms and conditions" as "legal bullshit". She said the word was unacceptable, and she threatened to hang up again.

All this time, I didn't understand why she would not let the process run its course. The other guy had already submitted a request. She called back one other time, again to convince me that the data had been wiped out, and wanted to put an engineer on the phone to explain the reasoning they used to infer that.

She offered to waive the fees for my repair. But she completely misread the situation. I had even offered to pay for shipping to get my old drive back, in addition to paying for the repair work.

She now asserted that everyone else who had looked at this case was wrong. The repair report was wrong. Her interpretation of the "notes" was right. She apologized for all the other people who got it wrong.


I'll learn in a few days if my data will be forever lost. It's funny how it is: when this kind of thing happens, it erodes your relationship with the brand. In the past I convinced myself that I was paying for quality and better customer service; the next time I buy a new computer, my evaluation of Apple will have suffered in those respects.

When simple is too simple

Thanks to reader Don M, I came across this fascinating chart published in the New York Times Review recently (link). The main article, about gender segregation in job categories, is found here.

This is one of those charts that require a reader's guide.

The chart shows the proportion of women in each job category in 1980 and in 2010 (and nothing in between). The jobs are divided into three large chunks: the top chunk (shaded) consists of jobs in which women account for more than 70 percent of the total; the middle chunk (white background) consists of jobs with 30 to 70 percent women; the bottom chunk (also shaded) consists of jobs with more than 70 percent men.

The designer then uses the red, green and gray colors (apologies to the color-blind folks) to group the jobs into three clusters. This is usually a great idea except that it is poorly executed here. Don is very annoyed with this because these colors lead the readers to the wrong conclusion, and I agree.


The color scheme is unnecessarily convoluted. Here is an alternative I prefer:

  • if the change is 5 percent or less, color as gray no matter where the line is. (It is insane to color the line for housekeepers "red" for going from 87 to 89 percent in 30 years). 
  • if the change is over 5 percent in the female direction, color it red to indicate the occupation is becoming more female. (There would be many red lines, such as for managers in education, HR staff, social workers, architects, etc.)
  • if the change is over 5 percent in the male direction, color it blue to indicate the occupation is becoming more male (There would be only one blue line, and that is for welfare service aides.)
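
The three rules above can be sketched as a small Python function. I am reading "5 percent" as 5 percentage points of the female share, which is an assumption, and the function and parameter names are mine.

```python
def line_color(pct_women_1980, pct_women_2010, threshold=5.0):
    """Return 'gray', 'red' or 'blue' for a job category.

    Assumes the threshold is in percentage points of the female share.
    """
    change = pct_women_2010 - pct_women_1980
    if abs(change) <= threshold:
        return "gray"  # too small a shift to call a trend
    return "red" if change > 0 else "blue"  # more female / more male


print(line_color(87, 89))  # housekeepers, the figures quoted above: gray
```

The point of the design is that the color depends only on the size and direction of the change, not on which shaded chunk the line happens to sit in.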

This would mean the lines for dentists and architects would be labelled progress. So too with most of the jobs that were predominantly male in 1980. In fact, there really isn't any occupation that went backwards--all those red lines in the bottom shaded chunk indicate shifts of only 1 to 4 percent, over 30 years!

This conclusion undermines the premise of the column, in which the author claims that the conventional wisdom is wrong.


The other precaution in reading this chart is to realize that each occupation is put on equal footing in this chart even though some job categories employ a lot more people than others. Also confounded with this data is the differential growth/decline in job categories over the 30-year period. Further, the proportion of women entering the labor force must be accounted for.

This is a case in which less is less. The structure of the problem is complex, and it requires a more sophisticated approach.

Expanding circles of error

Reader James H. spotted this offensive pie chart in Forbes (link).


This chart tells us that emerging markets will be responsible for the greatest growth in medical spending up to 2016.

It is hard to find this message in the chart. The gray sector for Japan in 2006 reads 10%, the exact same number as the gray sector in 2016, which appears several times as large. In a pie chart, it is hard enough to compare the sectoral areas within a pie, let alone sectors of different-sized pies.

James noticed that the pie areas are incorrect. The 2016 pie should be roughly double the area of the 2006 pie. This is not the case. It seems like the radius of the 2016 pie is at least three times larger than that of the 2006 pie.
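
The geometry is worth spelling out, since a circle's area grows with the square of its radius. Here is a minimal Python sketch using only the rough ratios mentioned above (double the area, triple the radius). The function names are my own.

```python
import math


def radius_ratio(value_ratio):
    """Radius multiplier needed so that the circle's AREA scales by value_ratio."""
    return math.sqrt(value_ratio)


def implied_area_ratio(radius_multiplier):
    """Area multiplier a reader actually sees for a given radius multiplier."""
    return radius_multiplier ** 2


print(round(radius_ratio(2.0), 3))  # 1.414
print(implied_area_ratio(3.0))      # 9.0
```

So a pie representing double the spending needs a radius only about 1.4 times as large, while a radius three times as large implies nine times the spending.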


As usual, a line chart brings out the trend more clearly:



The projected numbers should be clearly labelled as such. "2016" should read "2016P". I'm not sure if the 2011 number was projected also; it depends on when the data source was published.

The worst thing about this chart is it's completely misleading. It fails to recognize that there are many billions of people in emerging markets and "rest of the world" while the U.S., Europe and Japan combined have just over one billion people. Thus, all this chart is really saying is that population growth in the next several years will mostly occur in emerging markets. One can substitute medical spending with any kind of mass market spending and have essentially the same picture.

Below is a rough estimate of the per-capita medical spending by region using population sizes in 2011. For emerging markets, I have substituted BRIC, i.e. Brazil, Russia, India and China, which underestimates the population and thus overestimates the per-capita spend. These parts of the world spend a fraction of what industrialized countries are spending. So what's the story?