
Feb 13, 2007

Horrid stuff 2

Jon P took my comment on negative correlation and explored it further. Given the large ranges of values cited in the original Economist chart, Jon concluded that there wasn't enough evidence to make a judgement.

I agree to a large extent.  Apart from the high variability of individual measurements, we also face the tiny sample of 5 cities. 
By plotting rectangles, his chart implicitly assumes that the correlation of two factors is related to the product of the ranges (variability) of each factor.

A different way of looking at it is to plot only the mid-range values (i.e. ignoring the within-city variability).  The graph on the left hand side shows very little pattern.

Resorting to the formula, I found that the correlation = -0.03.  So there is a barely detectable negative correlation.  Let's visualize this.

In the graph on the right, I added the mean lines for both variables.  This divides the graph into four quadrants; dots that fall into the lower right and upper left quadrants make the correlation value negative.  There were three of those versus two in the positive quadrants; hence, the tiny negative correlation.
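For readers who want to try the quadrant idea themselves, here is a minimal Python sketch. The five (x, y) pairs are made up purely for illustration and are not the mid-range values from the Economist chart; the function names `pearson_r` and `quadrant_counts` are also just illustrative choices.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation computed directly from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def quadrant_counts(xs, ys):
    """Count dots in the positive (upper right, lower left) and
    negative (upper left, lower right) quadrants formed by the mean lines.
    Dots lying exactly on a mean line are counted in neither group."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    positive = negative = 0
    for x, y in zip(xs, ys):
        product = (x - mean_x) * (y - mean_y)
        if product > 0:
            positive += 1
        elif product < 0:
            negative += 1
    return positive, negative


# Hypothetical data: five made-up points, NOT the values from the chart.
xs = [1, 2, 3, 6, 8]
ys = [3, 6, 5, 1, 5]

r = pearson_r(xs, ys)
positive, negative = quadrant_counts(xs, ys)
print(f"r = {r:.2f}, positive quadrants: {positive}, negative quadrants: {negative}")
```

With these made-up points, two dots land in the positive quadrants and three in the negative ones, and the correlation comes out at roughly -0.17. Note that the count only hints at the sign: each dot contributes the product of its distances from the two mean lines, so a single dot far from both means can outweigh several dots close to them.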




Comments

-0.03? So R-squared is much less than 1%. Pretty tenuous. Also, you have three points in favor of a downward slope and two in favor of upwards, but one of those downward points would be touching the quadrant divider if the marker were a little larger. You could almost draw a circle connecting all the points.

We really need more cities, and more paired measurements, to say anything meaningful about the relationship.

I took the analysis sideways, and replotted my chart based on the assumption that there was some correlation (essentially plotting the diagonals of my boxes, rather than the centroids Kaiser plotted):

http://peltiertech.com/Excel/Commentary/HorridStuff.html

Why not? It's snowing and sleeting here (central Massachusetts), school is canceled and nobody's going anywhere, and it takes only two seconds to copy a sheet and delete a few rows.

Food for thought.

My monkey brain insists on seeing a pattern to the rise and fall, either two straight lines or a parabola. But with only five points of dubious accuracy, there's probably nothing really there.

I think we can all agree on the conclusion that five samples are not enough to make reliable inference.

My graph was intended to illustrate a neat way to visualize the concept of correlation. Nothing more than that!

I'm not having a go at anybody. I'm sorry if you feel like you're being piled on.

Derek, don't worry. I wrote that comment just to make sure that the point of the posting was not lost.
