


That's a fantastic technique. Does the trend have to be a LOESS or moving average or similar fit, or can it be a log or linear trend?

Which of the two variables supplies the trend, or do they each get a trend (each of which may be different)?


How do you know which part is the trend and which part is the noise?


derek: I de-trended each data series separately. This method, like any other, assumes a model: that there is residual correlation between X and Y that isn't related to time. If I were doing this seriously, I'd look at different methods as well to corroborate the findings. It doesn't matter what method you use to estimate the trend.

anon: not sure what you are asking exactly. For this problem, the model is that there is correlation between X and Y and that we aren't interested in each series' correlation with time, so the correlation with time is noise.
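
For readers who want to try the procedure described in this reply, here is a minimal sketch. It is not the code behind the post: the data are simulated, the straight-line trend is just one possible choice, and the names `linear_fit`, `detrend` and `pearson` are made up for the illustration.

```python
import random

def linear_fit(t, y):
    """Least-squares line y ~ intercept + slope * t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, y))
             / sum((a - mt) ** 2 for a in t))
    return my - slope * mt, slope

def detrend(t, y):
    """Residuals of y after subtracting its own fitted line."""
    intercept, slope = linear_fit(t, y)
    return [b - (intercept + slope * a) for a, b in zip(t, y)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Simulated data (an assumption): both series rise with time,
# but their fluctuations around the trend are independent.
rng = random.Random(0)
t = list(range(200))
x = [0.5 * i + rng.gauss(0, 5) for i in t]
y = [0.3 * i + rng.gauss(0, 5) for i in t]

raw_corr = pearson(x, y)                            # dominated by the shared trend
resid_corr = pearson(detrend(t, x), detrend(t, y))  # each series de-trended separately

print(f"raw: {raw_corr:.2f}, de-trended: {resid_corr:.2f}")
```

In this setup the raw correlation should be high and the de-trended one close to zero, because the only thing the two simulated series share is their dependence on time.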


I'm confused by what you're doing here. Aren't you a priori assuming that the trend is being driven by time and there's no causal link? If labour force participation is driving mileage, how would this appear different from the two being time driven, assuming that time is a major factor in labour force participation? To put it another way, let us suppose we have a thing X that is dependent on time and a thing Y that depends on X, so Y(t) = aX(t) + e, where e is error or another thing that influences Y and a is the coefficient linking Y(t) to X(t). Under your method, wouldn't you simply strip out the connection between X and Y and conclude, erroneously, that X and Y are simply time driven and there's no real link between X and Y?
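
The scenario in this question can be simulated directly. The sketch below is only an illustration under stated assumptions: X is taken to be a linear trend plus its own fluctuations, a = 2, and all numbers and helper names are invented. Under those assumptions the link survives de-trending, because the fluctuations of X around its trend pass through to Y.

```python
import random

def detrend(t, y):
    """Residuals of y after subtracting a least-squares line in t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, y))
             / sum((a - mt) ** 2 for a in t))
    return [b - (my + slope * (a - mt)) for a, b in zip(t, y)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)
t = list(range(500))
a_coef = 2.0  # the coefficient "a" in Y(t) = aX(t) + e (assumed value)

# X depends on time but also wobbles around its trend.
x = [0.5 * i + rng.gauss(0, 1) for i in t]
# Y is driven by X, plus independent error e.
y = [a_coef * xi + rng.gauss(0, 1) for xi in x]

resid_corr = pearson(detrend(t, x), detrend(t, y))
print(f"correlation of de-trended residuals: {resid_corr:.2f}")
```

The caveat is the one the question raises: if X were a perfectly smooth function of time with no fluctuations of its own, de-trending would leave nothing of X to correlate with, and the method could not tell the two stories apart.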


Jack, here's how I see what Kaiser's doing: you see mileage going broadly up with time and participation going broadly up with time, and when you plot them against each other the correlation looks good. But that could be just due to the broad trend. If they weren't just effects of time, you would still see a correlation when you take the big broad trend out.

The fact that the correlation stops looking so good when you take the broad trend out suggests they weren't coupled to each other, so you can reject the hypothesis that the data show they are. (Maybe you can't reject the hypothesis that they are coupled, but you can't use this as evidence that they are.)

I would suggest that you're not yet out of the woods if you still see a correlation after you've taken the secular trend out. Maybe they're being driven by a periodic cycle in time, such as an annual cycle. If you think that might be the case, you can try to remove the periodic time signal as well, and see if that makes the correlation go away.
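
Here is a rough sketch of that second step, using simulated monthly data in which both series share an annual cycle but are otherwise unrelated. Subtracting per-month averages is just one simple way to remove a periodic signal, and every name and number below is an assumption made for the illustration.

```python
import math
import random

def detrend(t, y):
    """Residuals of y after subtracting a least-squares line in t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, y))
             / sum((a - mt) ** 2 for a in t))
    return [b - (my + slope * (a - mt)) for a, b in zip(t, y)]

def remove_monthly_means(t, y, period=12):
    """Subtract the average value for each position in the cycle."""
    sums, counts = [0.0] * period, [0] * period
    for i, v in zip(t, y):
        sums[i % period] += v
        counts[i % period] += 1
    return [v - sums[i % period] / counts[i % period] for i, v in zip(t, y)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(7)
t = list(range(240))  # 20 years of monthly observations
season = [math.sin(2 * math.pi * i / 12) for i in t]
x = [0.10 * i + 10 * s + rng.gauss(0, 1) for i, s in zip(t, season)]
y = [0.05 * i + 8 * s + rng.gauss(0, 1) for i, s in zip(t, season)]

x_res, y_res = detrend(t, x), detrend(t, y)
corr_trend_removed = pearson(x_res, y_res)  # still high: shared annual cycle
corr_cycle_removed = pearson(remove_monthly_means(t, x_res),
                             remove_monthly_means(t, y_res))

print(f"after trend removal: {corr_trend_removed:.2f}")
print(f"after cycle removal: {corr_cycle_removed:.2f}")
```

With these made-up series, the correlation should stay high after the secular trend is removed and fall to near zero once the annual cycle is taken out as well.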

I think this is related to what statisticians do when they adjust participation to take out the annual cycle of employment (I can't remember the correct phrase). It's just that we're doing it to test whether two variables are correlated, by eliminating time, instead of doing it to test whether one variable is changing in secular time, by eliminating periodic time.


So, the logic here is:

-If two series are actually correlated, their residuals should be correlated.

-If only the two trends are correlated but not the residuals, we can assume that the series are both correlated to the same set of factors, but not to each other.

Is it?


@jack: "Aren't you a priori assuming that the trend is being driven by time and there's no causal link?"

Not at all.

In fact, the hypothesis being tested in this method is exactly the opposite: that there *is* a correlation independent of the aspect of time.

But most importantly, the analysis is done to find out what the relationship is, not to reinforce an assumption - if time was not the significant factor, you would see the correlation in the plot above.


While I applaud the sentiment here, it's worth noting that things can be more complicated than this, and that there are circumstances where a lack of correlation between the detrended residuals can be misleading too. Although it's not likely to be relevant in this specific instance, looking just at detrended residuals can cause you to miss a relationship between non-stationary variables that are integrated of order one. (There's a decent intuitive explanation of what that means here.)
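
One way this can happen is sketched below. The assumptions matter: X is simulated as a random walk (so it is integrated of order one), Y is X plus independent noise, and "de-trending" is done by taking first differences, which is a different trend-removal choice from the one used in the post. The data and helper name are invented for the example.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
n = 2000

# X is a random walk: each value is the previous one plus a shock.
x = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + rng.gauss(0, 1))

# Y tracks X in levels, with fairly large independent noise on top.
y = [xi + rng.gauss(0, 3) for xi in x]

# Remove the stochastic trend by first-differencing each series.
dx = [b - a for a, b in zip(x, x[1:])]
dy = [b - a for a, b in zip(y, y[1:])]

level_corr = pearson(x, y)
diff_corr = pearson(dx, dy)

print(f"levels: {level_corr:.2f}, differences: {diff_corr:.2f}")
```

Here Y follows X by construction, yet the differenced series are only weakly correlated because the noise dominates the period-to-period changes. A high correlation in levels is not proof of a link in general, since two unrelated random walks can also show one; in this sketch the link is known only because it was built in.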


conchis: Thanks for the article. I agree that there is no simple rule that applies to all cases.
