
Kevin Henry

You claim that resolving mistakes introduces bias; but this requires an assumption about the underlying distribution of errors. Yes, if positive and negative errors are equally common then only correcting negative errors will introduce bias. But if negative errors are more common than positive errors, correcting negative errors will tend to reduce bias.

So I don't think you can support your conclusion without giving evidence about that underlying distribution.


Kevin: When there is a mix of positive and negative errors, they tend to cancel each other out. Correcting only the negative errors will expose the positive errors, and thus increase bias on average.
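Both sides of this exchange can be checked with a small simulation. The sketch below is a toy model, not anything taken from the post or from a real credit bureau: it assumes every report carries one error of size 1 (either +1, overstating creditworthiness, or -1, understating it), that every negative error is disputed and fixed, and that nobody disputes an error in their favour. The function name `simulate_bias` and the parameter `p_negative` are made up for illustration.

```python
import random


def simulate_bias(p_negative, n=100_000, seed=0):
    """Return (mean error before disputes, mean error after disputes)."""
    rng = random.Random(seed)
    # Toy assumption: each report has exactly one error of size 1.
    # -1 understates creditworthiness, +1 overstates it.
    errors = [-1 if rng.random() < p_negative else 1 for _ in range(n)]
    # Toy assumption: every negative error is disputed and set to 0;
    # positive errors are never disputed, so they stay in the file.
    corrected = [0 if e < 0 else e for e in errors]
    return sum(errors) / n, sum(corrected) / n


for p in (0.5, 0.6, 0.8):
    before, after = simulate_bias(p)
    print(f"share negative={p:.1f}  bias before={before:+.3f}  after={after:+.3f}")
```

Under these assumptions the average error is about 1 - 2p before disputes and 1 - p after, where p is the share of negative errors. With an even mix the errors cancel beforehand and correcting only the negative ones leaves a clear upward bias. The absolute bias shrinks only when negative errors make up more than roughly two-thirds of all errors, so being merely "more common" is not enough in this model. Different error sizes or dispute rates would move that threshold.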

In my book, as in other books on credit scoring, I look at the types of algorithms used to make such predictions. Once you understand how they work, it is even harder to believe that these algorithms make errors in only one direction.


A further point is that what the credit agencies are doing is similar to what happens when analysing large administrative datasets from which exact identifying information has been removed, leaving only things like date of birth and city. It then becomes a matter of deciding how close the identifiers have to be to be considered the same subject. Choose wrongly and the analysis becomes pointless, either because we miss a lot of matches or because we have lots of spurious matches.
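The trade-off described here can be sketched in a few lines. This is only an illustration under stated assumptions: the records, the field names `dob` and `city`, the helper `same_subject`, and the rule itself (exact date of birth plus a fuzzy comparison of the city spelling) are all hypothetical, and real linkage systems are far more elaborate.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity of two strings in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_subject(rec_a, rec_b, threshold):
    """Hypothetical rule: identical date of birth, city spelling close enough."""
    if rec_a["dob"] != rec_b["dob"]:
        return False
    return similarity(rec_a["city"], rec_b["city"]) >= threshold


# Made-up records: the second is the first with a misspelled city,
# the third is a different person who shares the same birthday.
original = {"dob": "1970-03-14", "city": "Pasadena"}
misspelled = {"dob": "1970-03-14", "city": "Pasedena"}
other_person = {"dob": "1970-03-14", "city": "Pomona"}

for threshold in (0.4, 0.8, 0.95):
    print(
        threshold,
        same_subject(original, misspelled, threshold),
        same_subject(original, other_person, threshold),
    )
```

With these toy records, a loose threshold of 0.4 links the different person as well (a spurious match), a strict threshold of 0.95 rejects even the simple misspelling (a missed match), and 0.8 happens to separate the two. Where that middle ground lies depends entirely on the data.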


If credit scores become less able to distinguish good and bad customers then that is a good thing. Credit scoring amplifies existing inequalities by tending to reward people who are already well off whilst penalising those who are not.


Jack: The evidence is strong that credit scoring technology has vastly expanded the number of Americans who are able to get credit. Credit is not a right, it's a responsibility. Expanded credit has absolutely increased the average quality of life in America.

That credit scoring is not perfect is not a reason to reject the technology. One must compare it against the alternative. Before credit scoring, there was human scoring. Does human scoring not "reward people who are already well off"? I cover all this and more in Chapter 2.


'It's ok for the credit reports to have errors so long as they are unpredictable.'

This is true in the aggregate; it sucks for the hapless individual with negative errors; it's a boon for the person with positive errors.

Credit Repair

The dispute process introduces a bias so that the reports tend to contain mistakes that cause overestimation but not those that cause underestimation.

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.