## Nothing to see here

##### Mar 04, 2014

Some graphics are made to inform, some to amuse, some to delight. But the following scatter plot makes one wonder why why why...

What does the designer want to say?

***

I saw this chart inside an infographic titled "Where in the World are the Best Schools and the Happiest Kids?", via the Cool Infographics blog. The horizontal axis is happiness and the vertical axis is average test score.

So it appears that happy kids can get the best and the worst test scores, and kids with the best test scores can be both happy and sad.

That means the happiness of kids does not depend on their test scores.

But isn't this like an interaction plot? Isn't there a clear negative slope within each of the groups that have different marker colors? Or is that color just random? I agree though that it's a hideous chart.

You are missing the 3rd dimension of this chart - the location. Yes, kids can be happy and sad whilst doing well or badly in tests but it depends on which country they are in. The obvious message is that the best schools are located in the countries where the kids are happy and they get good scores (it seems Singapore is the place to be for schools!). When read in this way, it is actually very useful information.

We use these quadrant plots often to identify groups in industry analysis. It's not so much about predicting one with the other, but about seeing where all fall within both axes.

ssssss: isn't it the case that you can take any random scatter of points and use color markers to create "a clear negative slope"? That's the reason why I chopped off the axis labels.
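A minimal sketch of that point, under assumptions that are not from the infographic: the points are drawn independently and uniformly at random, and the "colors" are assigned afterwards as five diagonal bands ranked by x + y. The point count, seed, and number of bands are arbitrary choices for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def banded_correlations(n_points=1000, n_bands=5, seed=0):
    """Return the overall correlation and the correlation inside each color band."""
    rng = random.Random(seed)
    # Independent uniform coordinates: x and y have no real relationship.
    points = [(rng.random(), rng.random()) for _ in range(n_points)]
    overall = pearson([p[0] for p in points], [p[1] for p in points])

    # "Color" the points after the fact: sort by x + y and cut into equal bands.
    ordered = sorted(points, key=lambda p: p[0] + p[1])
    size = n_points // n_bands
    within = []
    for b in range(n_bands):
        band = ordered[b * size:(b + 1) * size]
        within.append(pearson([p[0] for p in band], [p[1] for p in band]))
    return overall, within


overall, within = banded_correlations()
print(f"overall r = {overall:.2f}")
for i, r in enumerate(within):
    print(f"band {i} r = {r:.2f}")
```

The overall correlation should come out near zero while every band shows a negative one, simply because the bands were defined along the diagonal. The slope within each color is produced by the coloring rule, not by the data.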

Jim/Benjamin: Pay attention to your "model of the world" when you interpret the chart. Everyone who makes a statement about this chart has implicitly applied a model of the relationship between three variables, happiness, test scores and location. It's really useful to describe your own model and then to understand the error bars based on how many effects you're estimating and with what data. Let me know if I lost you here.

If the best schools are located in countries where the kids are happy and have high test scores, isn't that basically a tautology? If you measure the schools by their test scores, wouldn't that mean the best schools are located where the schools are the best and the kids are happy?

I have no idea what the colors are supposed to represent, except maybe

Why the hate? To me, this chart is really interesting for two reasons: first, I like to see which countries lie where, as it seems to tell a little about their culture. It certainly does about my country. Second, the fact that happiness and scores are not related is surprising and new to me, and well worth illustrating.

In other words, I don't care "what the designer wants to say", I care about interesting data presented in a reasonably readable way.

It's great to see this discussion. My challenge to those who see things in this chart is to reconcile your two observations:
a) there is no general pattern in the scatter, meaning that happiness is not correlated with test scores
b) interpret each country individually, in other words, form a hundred statements of the form happiness is {positively|negatively} correlated with test scores in {country}, each of these statements being an assertion of correlation.

Design aside, I commented on this infographic some weeks back, on the site where you found it. My gripe was with the data on which it is supposedly based. My comment was as follows:

"Interesting, but perhaps misleading? The OECD's PISA testing results - you're right, from which some of the data above is probably sourced - are perhaps more about a given country's students' ability to 'do the test' amongst other things, which leads to some interesting questions about general comparability. That aside, 'happiness' is very subjective, and likely to include cultural bias also. I'm Australian. Australian school kids are apparently more 'unhappy' than 'happy' at school, despite the relatively short hours, inclusion of sports programs and the friendly, 'easy going' nature of most schools. On the other hand, countries (and non-countries like Shanghai!) rate highly in happiness where students attend school for extraordinarily long hours, more days of the week, and education is almost literally a life-or-death thing. Is that true happiness, or just a reflection of cultural difference? Either way, it's unlikely that any of the data on either axis is 100% comparable. As I say, an interesting exercise, nonetheless."

I will continue by saying that anyone looking at "Chinese" schools (read: selected cities in China) should check out this article by Brookings: http://www.brookings.edu/blogs/brown-center-chalkboard/posts/2013/12/11-shanghai-pisa-scores-wrong-loveless. The PISA results are nonsense beyond the China example as well; trust me, I'm a teacher! Then there is the 'happiness' index. There are plenty about (like this: http://www.noomii.com/blog/5639-happiness-income-by-country-map) but in the end it remains subjective and the whole exercise of trying to discern global-regional patterns and relationships isn't worth the effort, in my humble opinion.

Kaiser: I don't really agree with your second point. I'm not trying to find a correlation of happiness with test scores in individual countries; I'm trying to work out how the countries' cultural differences and education systems, as far as I know them, affect each variable independently. Yes, the two variables could each be shown in a separate chart and nothing of interest would be lost, except for the clear illustration that they don't seem to be correlated, which may be worthless if you've been expecting it; I wasn't, though.

Overall, I don't find the same flaws with this chart's execution as you do, maybe because I don't always require charts to be a vehicle for their author's message. In this case, I'm happy with it being just a clear enough visualisation of interesting data. However, Stephen makes good points about the reliability of the data, and on that front I believe the critique is well deserved.

Absolutely fair enough in your views Kaiser; we each bring our own eyes, perceptions and perspectives.

As an educator first, I am concentrating on the underlying premises of the data, which I find to be flawed. After that, to me, everything else - design/visual aspects and analysis - is based on nothing at all.

Perhaps I'm sensitive, because I have a similar issue at work (school) with those above insisting that their analyses of the data relating to student performance (which is infinitely more important than league tables and happiness measures) paint a certain picture, and reflect - often adversely - on teachers and their practices. The bottom line is that most often, these individuals have no idea about the underlying collection methods and treatment of the data, and worse, admit to no understanding of the numbers themselves (English teachers, most often!). In this case, the situation typically involves statistical moderation and manipulation to meet a pre-determined outcome as far as the governmental assessment authority is concerned. In fact, 'data' is the wrong word to use, because the numbers are an information product, designed solely to produce a certain outcome; in our case, a ranking for university entrance. The worst part is the follow-on admonitions from senior staff to 'do better' and 'work harder'.

Compared with students' results that affect their futures, this is certainly a 'frivolous' example. That said, the OECD PISA testing results are used as a serious comparison between countries, and indirectly put pressure on teachers in these countries. Governments want to rank higher! This is a reality, and yet, the PISA data is a nonsense, and no valid comparisons should rightly be made between countries' academic standards based thereon.

To me, therefore, whatever picture the infographic is supposed to portray is effectively pointless. In this case the second variable - happiness - is even more ludicrous.

In looking at it, the visual message conveyed by the use of the point colour transition from pale blue to red (top right to bottom left diagonally) should suggest some type of grouping, be it cultural or regional, or whatever, as long as it's relevant to the countries. In this case, it's doing nothing; ie, highly academic, 'happy' countries are the same colour as low academic, 'happy' countries and even mildly 'unhappy' countries, regardless of culture, location or anything. This is a design aspect.

So I'll remain 'interested' to the point of analysing what's really going on, and critiquing everything, but I won't spend time attributing any real worth to this particular piece as an infographic from which conclusions can be made.

What I am truly happy to do, is spend time discussing the matter in a forum such as this, and hearing what other thoughtful and talented observers have to say. So in the end, sincere thanks Kaiser, for the opportunity to do so. No doubt this kind of discussion and debate is why you devote time to your blog; and it is also why this forum - and you - have a high degree of credibility.

Cheers

Stephen

There is one point that occurred to me that I should add, given the nature of my particular critique of this infographic. As someone who works in education (specifically geographical education) not just at a school level, but also in curriculum at the state and national level in my country, and with experience internationally, I know the background to the data and am therefore aware of its limitations. Not everyone is in a position to do this.

The call that "the data upon which this infographic is based is deeply flawed/invalid, therefore compromising the information product" is not one we can often make because it takes a special, sometimes insider's understanding that we can't possess. In most cases, we are unaware of how data are collected and manipulated before they make it to public eyes and we simply take it, and any subsequent use of it, at face value.

With the increasing use and availability of multitudes of infographics and data visualisations on the 'net, I think we often start with the unavoidable assumption that the data are okay, and we proceed straight to the step of looking at the patterns, relationships, messages or the successfulness of design and representational techniques.

I realise that I looked at this infographic from a particular, invested point of view where others coming from other backgrounds will see it entirely differently.

Stephen: Click on the Trifecta Checkup link on the right column. You'll find that I like to decompose the visualization critique into three questions: data is one of them. But I also believe that you can look past the data quality and still comment on the design.

Kaiser: I agree totally, and have done so many times myself (comment on design alone). I am aware of your Trifecta Checkup and would suggest that people such as yourself are the exception in starting with the data. I'm just saying that "most people" out there in internet land start from the visual and skip the data part because that either takes some digging to find, or can't be found at all.

Stephen: It's true that readers rarely think about the data. It's also true what you said earlier that if the data is crap, the average reader should stop reading... and it's left to bloggers like me to look past the data. Since you read the blog, you also know I believe that the designer, rather than the reader, should be doing the hard work.

Plotted are not the scores of the countries, but the ranks. Which makes the chart even more useless.
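To illustrate why ranks carry less information than scores, here is a small sketch with made-up numbers. The country labels and scores are hypothetical and not taken from the infographic or from PISA; the helper `to_ranks` is likewise only an illustration.

```python
def to_ranks(scores):
    """Rank entries from 1 (highest score) downward; ties are not handled."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Hypothetical scores, invented purely for illustration.
scores = {"A": 560, "B": 558, "C": 470, "D": 469}
ranks = to_ranks(scores)

for name in scores:
    print(name, scores[name], ranks[name])
```

In this example B and C are 88 points apart while A and B are only 2 points apart, yet both pairs sit exactly one rank apart. A scatter of ranks spreads the points evenly along each axis and hides how close or far apart the underlying scores are.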
