
A good question deserves good data


The last chart in the infographic on OECD education data asks another intriguing question: do countries that pay teachers more achieve better test scores?

Soshable_payperf

This chart suffers from the same ill as the one previously discussed (here): the data is not suitable to address the question. It is mighty hard to see any pattern in the set of bar charts on offer. This lack of correlation can be confirmed by displaying the data in a scatter plot:

Redo_payperf
The scatter plot on the left presents the data as shown in the original, with a regression line drawn in that appears to indicate a positive correlation between spending and achievement.

Here, spending is measured by the ratio of primary teacher pay after 15 years of service to GDP per capita, while achievement is indicated by the proportion of students who attain a "top" level of proficiency in any or all of the three test subjects.

But notice the solitary point sitting in the top right corner (labelled "1"). That point is Korea, which has both the highest achievement and the highest spending (by far). Korea is an outlier, and more specifically a high-leverage point: its extreme value on the horizontal axis lets it pull the regression line toward itself. The chart on the right is the same as the one on the left with Korea removed. What appeared to be a moderate positive correlation vanishes. (The numbers plotted are the ranking of countries by the proportion of students attaining top proficiency, the metric on the vertical axis.)
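The effect of a single high-leverage point can be reproduced with a small sketch. The numbers below are invented for illustration and are not the OECD figures; `pearson_r` and `leverage` are helper names defined here, and the extreme point merely plays the role of Korea.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def leverage(xs, i):
    """Hat value of point i in a simple linear regression on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return 1 / n + (xs[i] - mean_x) ** 2 / sxx


# Hypothetical data, NOT the OECD values: pay ratio (x) and percentage
# of top-proficiency students (y) for a cluster of similar countries.
pay_ratio = [0.9, 1.0, 1.1, 1.2, 1.3, 1.0, 1.1, 1.2]
top_share = [14.0, 10.0, 16.0, 9.0, 13.0, 11.0, 15.0, 12.0]

# One extreme point far to the top right (hypothetical values).
outlier_x, outlier_y = 2.2, 25.0

all_x = pay_ratio + [outlier_x]
all_y = top_share + [outlier_y]

r_with = pearson_r(all_x, all_y)
r_without = pearson_r(pay_ratio, top_share)
outlier_leverage = leverage(all_x, len(all_x) - 1)

print(f"r with the extreme point:    {r_with:.2f}")
print(f"r without the extreme point: {r_without:.2f}")
print(f"leverage of the extreme point: {outlier_leverage:.2f}")
```

In this made-up example the correlation is clearly positive with the extreme point included and close to zero without it, which is the same pattern the two scatter plots show.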

So the message is either that achievement and spending are uncorrelated (for every country except Korea) or that we have a measurement problem. I think the latter is more likely, and would defer to psychometricians to say what measures of spending and achievement are acceptable. Does the pay of primary teachers with 15 years of service represent "education spending"? Do top students adequately capture general achievement in the education system?

***

Soshable_payperf_closeup

The original chart contains a serious misinterpretation of the data (source: Education at a Glance 2009, OECD). It falsely assumes that the proportions of students attaining top proficiency in each subject are additive. In fact, because the same student could be top in more than one subject, such a sum counts some students more than once, and its base is no longer 100%.

In my version, the metric used is the proportion of students who attain top proficiency in 1, 2 or all 3 subjects. This metric is computed off a 100% base.
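A toy example makes the double counting concrete. The cohort below is hypothetical (ten invented students, not OECD data), and the variable names are chosen only for this sketch.

```python
# Hypothetical cohort of 10 students, NOT OECD data: each entry lists the
# subjects in which that student reaches the top proficiency level.
students = [
    {"reading", "math", "science"},  # top in all three subjects
    {"reading", "math"},             # top in two subjects
    {"math"},                        # top in one subject
    set(), set(), set(), set(), set(), set(), set(),  # top in none
]

subjects = ["reading", "math", "science"]
n = len(students)

# Share of students who are top performers in each subject separately.
share_by_subject = {
    s: sum(s in tops for tops in students) / n for s in subjects
}

# Adding the three shares counts multi-subject students more than once.
naive_sum = sum(share_by_subject.values())

# Share of students who are top in 1, 2 or all 3 subjects (100% base).
share_any = sum(bool(tops) for tops in students) / n

print(f"sum of per-subject shares: {naive_sum:.0%}")
print(f"share top in at least one: {share_any:.0%}")
```

Only three of the ten students are top performers in anything, yet the per-subject shares add up to twice that, because the student who is top in all three subjects is counted three times.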

I also removed the breakdown by gender. It creates clutter, and I found nothing of interest in the separate male and female data.

***

See also our first post on this infographic.

 

Comments

Rahul

The fact that the integer labels on the left plot do not correlate with the labels on the right plot is very confusing. It would have been more natural to use country letter codes instead.

Besides, the absolute integer rank is not very important, and the relative rank is easy to discern from the vertical axis.

Also, your metric "any or all of the three test subjects." seems to make no distinction between someone good in just one subject versus someone good in all three? Isn't some sort of additivity (however imperfect) important? An outcome with an all-round good student should be better than an outcome with a student good in just one subject?

Is GDP/capita a good measure to normalize by? I'd rather go for average income or PPP or some such.

Kaiser

Rahul: The charts were created to explore the data. They're not supposed to replace the original. If you are focusing on individual data points rather than the pattern of points, you're missing the point of the exercise. Replacing the original is beside the point when the data offered do not address the question at hand. Besides, you haven't defined what your metric of achievement is.

Chad

Kaiser,

Fantastic post. I have also become more concerned with the fundamentals underneath both the methods and assumptions BEHIND the data. We sometimes get so caught up in "what's the best way to visualize this?" that we forget the more important questions: "is this data even valid?" and "are we asking the right questions?".

Of course real world data collection has innumerable constraints and I recognize this fully.

Again, nice post, good to take this view once in a while.

Chad

Rahul

Kaiser:

I think the best critique of the original infographic is if we can present a credible alternative infographic that does better at answering the question posed.

It is easier to provide piecewise refutations and improvements but capturing it all in one useful graphic doesn't follow easily.

e.g. the labeling difficulty in a scatter plot with clusters is not easily scrubbed away.
