## Be guided by the questions 2

##### Dec 18, 2010

In a prior post, I showed a chart of PISA test scores that can be used to investigate differences between any pair of countries. At least one reader found it confusing because it contained too much data. I then realized that if the objective of the chart is restated as "How the UK fared relative to other OECD countries", which was the intent of the original Guardian chart, the chart could be presented in the following simplified fashion:

Simplification can be achieved in many ways, one of which is simplifying the objective. In fact, I would not be opposed to showing just the left side of the chart, which addresses an even more general question: how the countries fared overall.

***

The lines in the Guardian chart, which is essentially a parallel coordinates plot, display correlations of math, reading and science scores within specific countries. The same correlations can be visualized in a scatterplot matrix (see this post).

Each scatter plot here relates the scores of two subject areas as indicated by the axis labels. The simplest observation is the high degree of positive correlation on all three panels: in other words, countries in general do well in all three subjects, or poorly in all three subjects.
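To put a number on what the three panels show, one could compute the pairwise correlation between subjects across countries. The sketch below is only illustrative: the country names and scores are made-up placeholders, not PISA results, and the `pearson` helper is a hand-rolled assumption rather than anything used to produce the charts in this post.

```python
from itertools import combinations
from math import sqrt

# Placeholder scores for hypothetical countries A-E. These are NOT real
# PISA results; substitute the actual country-level scores to use this.
scores = {
    "A": {"reading": 520, "math": 530, "science": 525},
    "B": {"reading": 495, "math": 490, "science": 500},
    "C": {"reading": 470, "math": 465, "science": 475},
    "D": {"reading": 540, "math": 545, "science": 550},
    "E": {"reading": 480, "math": 500, "science": 485},
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


subjects = ["reading", "math", "science"]

# One correlation per panel of the scatterplot matrix.
pairwise = {}
for a, b in combinations(subjects, 2):
    xs = [country[a] for country in scores.values()]
    ys = [country[b] for country in scores.values()]
    pairwise[(a, b)] = pearson(xs, ys)

for pair, r in pairwise.items():
    print(pair, round(r, 3))
```

Values close to 1 on all three pairs would correspond to the tight, upward-sloping clouds described above.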

This pattern explains why it isn't very productive to focus readers' attention on this set of correlations when dealing with this data set.

You'll notice the use of colored dots on the scatter plots. Imagine that I have put the countries into groups based on overall scores (rather than just reading scores) as in my earlier analysis. The dots of the same color represent countries that are deemed to have performed similarly. The black cross indicates the "average country".
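For readers who want to experiment with this kind of grouping, here is one simple possibility. It is an assumption made for illustration, not the method behind the colors in these charts: average the three subject scores per country, rank the countries, and cut the ranking into equal-sized groups. The scores are made-up placeholders, and `assign_groups` and `n_groups` are hypothetical names.

```python
from statistics import mean

# Placeholder scores for hypothetical countries A-F, NOT real PISA results.
scores = {
    "A": {"reading": 520, "math": 530, "science": 525},
    "B": {"reading": 495, "math": 490, "science": 500},
    "C": {"reading": 470, "math": 465, "science": 475},
    "D": {"reading": 540, "math": 545, "science": 550},
    "E": {"reading": 480, "math": 500, "science": 485},
    "F": {"reading": 450, "math": 455, "science": 445},
}

# Overall score per country: the plain average of the three subjects.
overall = {country: mean(s.values()) for country, s in scores.items()}

# Countries ordered from best to worst overall score.
ranked = sorted(overall, key=overall.get, reverse=True)


def assign_groups(ranked_countries, n_groups):
    """Split a ranked list into n_groups consecutive, near-equal chunks.

    Returns a dict mapping country -> group index (0 is the top group).
    """
    size = -(-len(ranked_countries) // n_groups)  # ceiling division
    return {c: i // size for i, c in enumerate(ranked_countries)}


groups = assign_groups(ranked, n_groups=3)

# The "average country": the mean score on each subject across countries.
average_country = {
    subject: mean(s[subject] for s in scores.values())
    for subject in ("reading", "math", "science")
}

print(groups)
print(average_country)
```

Each group index would map to one dot color, and `average_country` gives the coordinates of the black cross on each panel. Equal-sized chunks are the crudest choice; cutting at the largest gaps in overall score is an equally plausible alternative.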

Focusing on the colors for the moment, you can confirm yet again that a country doing well in one subject is highly predictive of it doing well in the other subjects.

As I pointed out at the start of the prior post, using a little statistical technique allows us to understand the data better, and plotting summaries of the data allows us to draw more interesting conclusions than putting all the data, unperturbed, onto a canvas.

In the previous plot you 'promised' to explain how you grouped the countries. So how did you group them?

I would redo Redo_Pisa3 (the first pair of graphics at the top) so that:

a) the scale "UK differential from other OECD countries" was actually "other OECD countries differential from UK", so that the better scores are not confusingly negative

b) the scale graph is turned on its side to be vertical

c) the table at left is placed on the right of the scale graphic, with leader lines between the positions on the scale and the country labels

Now it's not two graphics, but one. I might also turn the triangles into solid lines between the highest and lowest score in the group, keeping the blurring that simplifies the graph, but giving an idea of the size of the in-group range and the gaps between groups.

As I suggested in my comment to the previous post, I would have plotted the Redo_pisa3 plot so better is to the right, not to the left. Derek suggested changing the labels, but this would have kept better performance to the left, which is still confusing.

I definitely like the scatter plots best. I was kind of surprised before that some countries (like mine, Sweden) jumped so many steps between the three different ranks. But the scatterplots are really tight, so the jumps can't be that bad...

I posted the below comment in the wrong thread. Sorry.

I believe it would be interesting to calculate a multi-trait multi-method matrix with the three different tests, to see how the correlations between the subjects relate to one another.