The context is, yet again, climate data and its analysis. In my prior post on this topic, I remarked that "For me, the real climategate is the woeful state of statistical education." Specifically, I felt that scientists should feel confident enough to say that some statistical adjustments of data (which Phil, quoting Richard Lindzen, calls "corrections") are legitimate, and definitely not something shameful to hide. In his post, Phil came out in defense of two such adjustments. Bravo for him.
***
Richard Lindzen's accusation boils down to a perverse incentive. Climate scientists (of the mainstream school) are so dogmatic about their beliefs that they would do anything to prove their models correct. The data processing stage of any statistical analysis is exploratory -- the analysts are looking for things that don't look right, the (in)famous "outliers", and so on. Given the strong beliefs in their models, these analysts would only fix errors that bring the data closer to the models while they would have no incentive to fix errors that bring the data further from the models.
This is an absolutely acute observation -- but the incentive applies to analysts on both sides of the climate debate, as well as every analyst out there doing statistics!
I discuss a great example of this in Chapter 2 of Numbers Rule Your World ("Bagged Spinach / Bad Score"). Consumers are judged on creditworthiness by credit scoring models. Consumer protection advocates love to demand that consumers be allowed to inspect their credit reports, and correct any mistakes.
If I find an error that causes an artificially low score (say, a claim that I missed a payment when I did not), you can be sure I will fix that error. If I find an error that causes an artificially high score (say, a missed payment that was never recorded), am I going to volunteer this information to the credit bureau? Would you?
This may sound innocuous but the people who are most hurt are those of us with better creditworthiness. In the long run, the lower scores will be moved towards the average because errors that depress a score are corrected diligently, while those with high scores have less incentive to inspect their data. The dispersion of scores becomes smaller, and it becomes harder for the banks to separate the good risks from the bad risks. Be careful what you wish for!
(This is further exacerbated by credit-repair scams; I refer you to the book for more discussion.)
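To make the mechanism concrete, here is a minimal simulation sketch of one-sided error correction. Everything numeric in it is an assumption made for illustration and does not come from the book or from any real scoring model: true scores are drawn from a normal distribution centered at 680, report errors are symmetric noise, and the hypothetical `inspect_probability` function says that the chance of inspecting a report falls from certain at a reported score of 550 to zero at 750. Consumers who inspect fix only the errors that lowered their score.

```python
import random
import statistics


def inspect_probability(reported_score):
    """Hypothetical incentive curve: the lower the reported score,
    the more likely the consumer is to inspect the credit report."""
    return min(1.0, max(0.0, (750.0 - reported_score) / 200.0))


def simulate(n=20000, seed=0, error_sd=30.0):
    """Return reported scores before and after one-sided correction."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    before, after = [], []
    for _ in range(n):
        true_score = rng.gauss(680.0, 60.0)   # assumed score distribution
        error = rng.gauss(0.0, error_sd)      # assumed symmetric report error
        reported = true_score + error
        before.append(reported)

        inspects = rng.random() < inspect_probability(reported)
        if inspects and error < 0:
            # Only errors that hurt the consumer get fixed;
            # errors that inflate the score are left alone.
            reported = true_score
        after.append(reported)
    return before, after


if __name__ == "__main__":
    before, after = simulate()
    print("mean before/after:", statistics.mean(before), statistics.mean(after))
    print("spread before/after:", statistics.pstdev(before), statistics.pstdev(after))
```

Under these assumptions, the comparison of interest is the standard deviation of reported scores before and after correction. Swapping in a different `inspect_probability` (for example, one that rises with the score) is a quick way to see how much the conclusion depends on who actually inspects their reports.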
Since this is also a question of ethics, read Andrew's remarks on ethics, and the comments on that post.
Your claim about credit report self-correction reducing the spread in credit scores assumes that the probability of a person requesting and correcting their report is independent of their score. I would not be surprised, however, to find that in fact people with higher credit scores are much more likely to monitor and correct their reports than those with lower scores. If so, self-correction could actually increase the spread of scores. The rich get richer, so to speak.
Posted by: Roban Hultman Kramer | 03/30/2010 at 07:34 AM
Roban: No, I did not assume independence. I said "those with high scores have less incentive to inspect their data". Typically, you apply for a credit card or a loan and get rejected, the loan provider tells you it's because of your credit score, and only then do you inspect the credit report. If you have a 780 credit score, there is very little reason to spend the energy to make it 790. But if you are at 580, that's a different story.
Posted by: Kaiser | 03/30/2010 at 09:18 PM