
Kevin Henry

You claim that resolving mistakes introduces bias; but this requires an assumption about the underlying distribution of errors. Yes, if positive and negative errors are equally common then only correcting negative errors will introduce bias. But if negative errors are more common than positive errors, correcting negative errors will tend to reduce bias.

So I don't think you can support your conclusion without giving evidence about that underlying distribution.


Kevin: When there is a mix of positive and negative errors, they tend to cancel each other out. Correcting only the negative errors will expose the positive errors, and thus increase bias on average.

In my book, as in other books on credit scoring, I look at the types of algorithms used to make such predictions. Once you understand how they work, I think it becomes even more difficult to believe that these algorithms only make errors in one direction.
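
The disagreement in this exchange can be made concrete with a little arithmetic. The sketch below is not from the post or the book. It assumes a toy model in which each report is either too low by a fixed amount (a negative error), too high by the same amount (a positive error), or correct, and in which disputes remove some share of the negative errors only. The function name `mean_error`, the error size, and the probabilities are all made up for illustration.

```python
def mean_error(p_neg, p_pos, size=10.0, fix_rate=0.0):
    """Expected score error per report in a toy three-outcome model.

    p_neg    -- share of reports that are too low by `size` (hypothetical)
    p_pos    -- share of reports that are too high by `size` (hypothetical)
    fix_rate -- share of negative errors removed through disputes
    """
    if p_neg < 0 or p_pos < 0 or p_neg + p_pos > 1:
        raise ValueError("p_neg and p_pos must be non-negative and sum to at most 1")
    if not 0.0 <= fix_rate <= 1.0:
        raise ValueError("fix_rate must be between 0 and 1")
    # Positive errors are never disputed, so only the negative share shrinks.
    remaining_neg = p_neg * (1.0 - fix_rate)
    return size * (p_pos - remaining_neg)


if __name__ == "__main__":
    scenarios = {
        "equal error rates": (0.2, 0.2),
        "negative errors more common": (0.4, 0.1),
    }
    for label, (p_neg, p_pos) in scenarios.items():
        for fix_rate in (0.0, 0.5, 1.0):
            bias = mean_error(p_neg, p_pos, fix_rate=fix_rate)
            print(f"{label}, fix_rate={fix_rate}: mean error {round(bias, 6)}")
```

Under these assumptions, equal error rates start with no average bias and end with a positive one as soon as any negative errors are corrected. When negative errors are four times as common, correcting half of them moves the average error from about -3 to about -1, which is Kevin's point, while correcting all of them overshoots to about +1. Which case applies depends on the actual distribution of errors, which this sketch does not know.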


Another point is that what the credit agencies are doing is similar to what happens when analysing large administrative datasets from which exact identifying information has been removed, leaving only things like date of birth and city. It then becomes a matter of deciding how close the identifiers have to be to be considered the same subject. Choose wrongly and the analysis becomes pointless, either because we miss a lot of matches or because we have lots of spurious matches.
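
The threshold trade-off described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: the records, the rule (exact date of birth plus a city-name similarity score), and the function names `city_similarity` and `is_match` are hypothetical, and the similarity measure is simply the one provided by Python's standard `difflib` module, not whatever the credit agencies actually use.

```python
from difflib import SequenceMatcher


def city_similarity(a, b):
    """Similarity of two city strings, from 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(rec_a, rec_b, threshold):
    """Treat two (date_of_birth, city) records as the same subject when the
    dates of birth are identical and the cities are similar enough."""
    dob_a, city_a = rec_a
    dob_b, city_b = rec_b
    return dob_a == dob_b and city_similarity(city_a, city_b) >= threshold


# Hypothetical records: `same_person` is assumed to be the reference subject
# with the city written differently; `other_person` is assumed to be someone else.
reference = ("1970-01-01", "new york")
same_person = ("1970-01-01", "new york city")
other_person = ("1970-01-01", "newark")

if __name__ == "__main__":
    for threshold in (0.9, 0.75, 0.7):
        print(
            threshold,
            is_match(reference, same_person, threshold),
            is_match(reference, other_person, threshold),
        )
```

With these made-up records, a strict threshold of 0.9 misses the true match, a loose threshold of 0.7 also links the unrelated record, and only a value in between separates the two. Real data rarely offers such a clean middle ground, which is the commenter's point.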


If credit scores become less able to distinguish good and bad customers then that is a good thing. Credit scoring amplifies existing inequalities by tending to reward people who are already well off whilst penalising those who are not.


Jack: The evidence is strong that credit scoring technology has vastly expanded the number of Americans who are able to get credit. Credit is not a right, it's a responsibility. Expanded credit has absolutely increased the average quality of life in America.

That credit scoring is not perfect is not a reason to reject the technology. One must compare it against the alternative. Before credit scoring, there was human scoring. Does human scoring not "reward people who are already well off"? I cover all this and more in Chapter 2.


'It's ok for the credit reports to have errors so long as they are unpredictable.'

This is true in the aggregate; it sucks for the hapless individual with negative errors; it's a boon for the person with positive errors.

Credit Repair

The dispute process introduces a bias so that the reports tend to contain mistakes that cause overestimation but not those that cause underestimation.


Marketing and advertising analytics expert. Author and Speaker. Currently at Columbia.
