
Feb 13, 2007

Horrid stuff 2

Jon P took my comment on negative correlation and explored it further. Given the large ranges of values cited in the original Economist chart, Jon concluded that there wasn't enough evidence to make a judgement.

I agree to a large extent.  Apart from the high variability of individual measurements, we also face the tiny sample of 5 cities. 
By plotting rectangles in his chart, he implicitly assumed that the correlation of two factors is related to the product of the ranges (variability) of each factor.

A different way of looking at it is to plot only the mid-range values (i.e. ignoring the within-city variability).  The graph on the left hand side shows very little pattern.

Resorting to the formula, I found that the correlation = -0.03.  So barely detectable negative correlation.  Let's visualize this.

On the right graph, I added the mean lines for both variables.  This divides the graph into four quadrants; dots that fall into the lower right and upper left quadrants make the correlation value negative.  There were three of those versus two in the positive quadrants; hence, the tiny negative correlation.
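For readers who want to try the quadrant view themselves, here is a minimal sketch in Python. The five (x, y) pairs are made-up illustrative values, not the Economist mid-range figures, and the function name `pearson_with_quadrants` is my own. It computes the correlation from the deviations around the two mean lines and counts how many dots land in the positive and negative quadrants.

```python
from math import sqrt

def pearson_with_quadrants(xs, ys):
    """Return (r, positive_count, negative_count) for paired values.

    Each point contributes (x - mean_x) * (y - mean_y) to the numerator
    of the correlation. That product is positive in the upper right and
    lower left quadrants, and negative in the upper left and lower right.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    products = [a * b for a, b in zip(dx, dy)]
    r = sum(products) / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    positive = sum(1 for p in products if p > 0)
    negative = sum(1 for p in products if p < 0)
    return r, positive, negative

# Hypothetical data for illustration only (NOT the values from the chart).
xs = [1.0, 2.0, 3.0, 5.0, 9.0]
ys = [2.0, 6.0, 5.0, 7.0, 0.0]

r, positive, negative = pearson_with_quadrants(xs, ys)
print(f"r = {r:.2f}, positive quadrants: {positive}, negative quadrants: {negative}")
```

Note that the count of dots is only a rough guide: the sign and size of the correlation depend on how far each dot sits from the mean lines, not just on which quadrant it falls in.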




Comments

-0.03? So R-squared is much less than 1%. Pretty tenuous. Also, you have three points in favor of a downward slope and two in favor of upwards, but one of those downward points would be touching the quadrant divider if the marker were a little larger. You could almost draw a circle connecting all the points.

We really need more cities, and more paired measurements, to say anything meaningful about the relationship.

I took the analysis sideways, and replotted my chart based on the assumption that there was some correlation (essentially plotting the diagonals of my boxes, rather than the centroids Kaiser plotted):

http://peltiertech.com/Excel/Commentary/HorridStuff.html

Why not? It's snowing and sleeting here (central Massachusetts), school is canceled and nobody's going anywhere, and it takes only two seconds to copy a sheet and delete a few rows.

Food for thought.

My monkey brain insists on seeing a pattern to the rise and fall, either two straight lines or a parabola. But with only five points of dubious accuracy, there's probably nothing really there.

I think we can all agree on the conclusion that five samples are not enough to make reliable inference.

My graph was intended to illustrate a neat way to visualize the concept of correlation. Nothing more than that!

I'm not having a go at anybody. I'm sorry if you feel like you're being piled on.

Derek, don't worry. I wrote that comment just to make sure that the point of the posting was not lost.
