In a previous post, we used some nifty EDA techniques to explore the relationship between Starbucks density and Obama victory margin. This post examines the intent of the original chart, and how the chart itself can be used to reach the same conclusion: that there is not much correlation.
Calling it a dual-axis plot is a misnomer, as there is in fact a third scale we have not previously mentioned: the ranks of Obama victory margins, sitting on the horizontal axis. Notice that the states were placed in a particular order such that the blue dots fell into an increasing pattern. Because of this, it is not sufficient for the green dots to show a weak linear trend.
At this point, let's reverse the axis and show the same chart but with the states ordered by increasing Starbucks density. This arrangement is more appropriate for exploring whether Starbucks density can explain the variation in Obama victory margins.
Here, we consider not just the linear regression line (dotted) but also the jagged line that joins each state to the next state. What is striking is the amount of gyration from one state to the next. There are many cases in which the Obama margin dropped markedly from a lower-density state to a higher-density state. This level of variability is seen along the entire horizontal range, and is roughly indicated by the green arrow. Compare this to the blue arrow, which is the range of estimated Obama margins across all states under the linear regression.
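For readers who want to try the comparison themselves, here is a minimal sketch of the two arrow lengths. The numbers are made up purely for illustration and are not the data behind the chart. The function name `arrow_lengths` is hypothetical, and treating the green arrow as the full range of observed margins is our rough reading of it.

```python
import numpy as np

# Made-up numbers for illustration only; NOT the data behind the chart.
density = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
margin = np.array([-20.0, 15.0, -25.0, 10.0, -5.0, 30.0, -15.0, 20.0])


def arrow_lengths(x, y):
    """Return (green, blue): observed spread vs. spread of the fitted line."""
    # Order the states by increasing density, as on the horizontal axis.
    order = np.argsort(x)
    x, y = x[order], y[order]

    # Least-squares straight line through the points.
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept

    # Green arrow (our assumption): range of the observed margins.
    green = y.max() - y.min()
    # Blue arrow: range of the margins estimated by the regression line.
    blue = fitted.max() - fitted.min()
    return green, blue


green, blue = arrow_lengths(density, margin)
print(f"green arrow: {green:.1f}, blue arrow: {blue:.1f}")
```

With these invented numbers, the observed margins swing over a much wider range than the fitted line does, which is the pattern described above.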
The fact that the green arrow dwarfs the blue one is an indication that the correlation between winning margins and Starbucks densities is weak. Roughly speaking, it tells us that at any given level of Starbucks density, the linear regression line explains only a small proportion of the total variation in margins, which is the same as saying that R-squared is small.
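The link to R-squared can be checked with a short sketch. It uses made-up numbers (not the chart's data) and a hypothetical helper named `r_squared`. It computes the proportion of total variation explained by the line, and confirms that for a one-predictor regression this equals the squared correlation.

```python
import numpy as np

# Made-up numbers for illustration only; NOT the data behind the chart.
density = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
margin = np.array([-20.0, 15.0, -25.0, 10.0, -5.0, 30.0, -15.0, 20.0])


def r_squared(x, y):
    """Proportion of the variation in y explained by a straight-line fit on x."""
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept
    ss_res = np.sum((y - fitted) ** 2)    # variation left over around the line
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot


r2 = r_squared(density, margin)
corr = np.corrcoef(density, margin)[0, 1]

# For simple linear regression, R-squared is the square of the correlation.
print(f"R-squared: {r2:.3f}, correlation squared: {corr ** 2:.3f}")
```

When the leftover variation around the line is nearly as large as the total variation, R-squared comes out close to zero, which is the numerical counterpart of the green arrow dwarfing the blue one.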