This space was originally intended for a post about seasonal adjustments to time-series data. That now has to wait, because I am recovering from a bout of late-to-the-gate depression. You know the feeling: having arrived at the airport just in time, you half-run, half-walk to the gate, only to learn that it has just closed and all the strain has been for naught.
I didn't miss a flight. I was knocked over by the Census Bureau and left mentally exhausted. Any of you who have processed lots of data know this feeling. Just before you decide to publish your results (thankfully before, and not after), you discover that the data you analyzed contain errors so egregious as to be nonsensical.
So I present to you the data on "new privately owned housing units started", commonly known as "housing starts". The offending spreadsheet can be downloaded from the Census Bureau here. (Screenshot on left.)
The file contains four sets of data: annual data, raw monthly numbers (not seasonally adjusted), seasonally adjusted monthly numbers, and the seasonal adjustment factors (which are just the ratios of the unadjusted to the adjusted numbers).
The shocker: the "seasonally adjusted" series is 10 times as big as the "unadjusted" series. I kid you not. In October 2000, the raw data showed 140,000 units of housing started; after adjustment, we magically had 1.5 million units started.
Since the seasonal adjustment factors were provided, I tried to reconcile the two sets of numbers. Perhaps correcting for a factor of 10 would be enough. It only caused more headaches.
According to the footnote, the factor is defined as "the ratio of unadjusted housing units started to the seasonally adjusted housing units started". For October 2000, this factor was given as 108, which I took to mean that the adjustment took the raw data down by about 8%.
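For anyone who wants to repeat the reconciliation, here is a minimal sketch of the arithmetic in Python. It uses only the rounded October 2000 figures quoted above, and it assumes (as I did) that a factor printed as 108 is a percentage, meaning 1.08. The variable names are mine, not the Census Bureau's.

```python
# Figures quoted in the post for October 2000, in thousands of units.
unadjusted = 140.0           # raw monthly housing starts (about 140,000)
published_adjusted = 1500.0  # "seasonally adjusted" figure (about 1.5 million)
factor_as_printed = 108.0    # seasonal factor as shown in the spreadsheet

# Assumption: the factor is a percentage, so 108 means 1.08.
factor = factor_as_printed / 100.0

# The footnote defines factor = unadjusted / adjusted,
# so the adjusted figure it implies is unadjusted / factor.
implied_adjusted = unadjusted / factor

# How far apart are the published and the implied adjusted figures?
gap = published_adjusted / implied_adjusted

print(f"implied adjusted:    {implied_adjusted:.1f} thousand")  # 129.6
print(f"published / implied: {gap:.2f}")                        # 11.57
```

With these rounded inputs, the published figure comes out at roughly 11.6 times the one implied by the footnote's definition, so a clean factor of 10 does not close the gap.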
But the digits wouldn't cooperate. Multiplying or dividing by 10 cannot resolve the fact that the seasonally adjusted "549" is larger than the unadjusted "397".
This is the unglamorous side of doing analytics and working with data. When I recover, I will write that post about seasonal adjustments.