The following throw-away lines in a Wall Street Journal article about the "return on investment" of getting into college debt (what an idea) are the most important ones:
The report [by the College Board] also doesn't account for dropouts or extra college years. Only 56% of students who enroll in a four-year college earn a bachelor's degree within six years, according to a report last year by the Harvard Graduate School of Education...
PayScale, a Seattle data firm, examines the links between pay and variables like colleges and majors. Its analysis, which also ignores dropouts but accounts for students who take longer to complete their degrees, ...
I cut that off since I've heard enough. How can they get away with ignoring dropouts when they are assessing the return on investment of college debt?
Imagine a cohort of 10,000 students who take on debt to start college. By year 6, which apparently is when they stop counting, 4,400 have not graduated, either because they dropped out or because they are still in school. Both of these groups are likely to have the lowest return on investment in the cohort. Most of the dropouts won't be getting the higher-paying jobs that go to college graduates. Those still in school are probably troubled students who, if they do graduate later, would also earn less; even if they are equally qualified, their earnings start later, so they are worth less by the time value of money.
Despite this reality, the analyses by the College Board and by PayScale "ignore dropouts" as if they never existed. In other words, they look only at the 5,600, not the 10,000. This means whatever return on investment they compute will be exaggerated.
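To make the size of the exaggeration concrete, here is a minimal sketch of the arithmetic. The 10,000 cohort and the 56% six-year graduation rate come from the text above; the two dollar figures are made-up placeholders chosen only for illustration, not numbers from the College Board, PayScale, or the WSJ article.

```python
# Cohort size and graduation rate are taken from the post.
COHORT = 10_000
GRAD_RATE_6YR = 0.56

# HYPOTHETICAL per-student net returns, for illustration only.
ASSUMED_RETURN_GRAD = 100_000      # graduates within six years
ASSUMED_RETURN_NON_GRAD = -20_000  # dropouts and those still enrolled


def average_return(groups):
    """Average per-student return over a list of (count, return) pairs."""
    total_students = sum(count for count, _ in groups)
    total_return = sum(count * ret for count, ret in groups)
    return total_return / total_students


graduates = round(COHORT * GRAD_RATE_6YR)  # 5,600
non_graduates = COHORT - graduates         # 4,400

# What you get if you "ignore dropouts": only the survivors are counted.
survivors_only = average_return([(graduates, ASSUMED_RETURN_GRAD)])

# What you get if everyone who took on the debt is counted.
full_cohort = average_return([
    (graduates, ASSUMED_RETURN_GRAD),
    (non_graduates, ASSUMED_RETURN_NON_GRAD),
])

print(f"Graduates only: {survivors_only:,.0f}")
print(f"Full cohort:    {full_cohort:,.0f}")
```

Under these assumed figures, the graduates-only average is more than double the full-cohort average. The exact gap depends entirely on the placeholder numbers; the direction of the bias holds whenever non-graduates earn a lower return than graduates.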
***
Technically, this is an example of survivorship bias. The sample being studied does not contain "non-survivors", in this case dropouts, so it doesn't generalize properly.
Also, the data is censored in the sense that the observation window is not long enough for us to know what would happen to those people who are in college longer than 6 years. This is a common feature of such data sets; you'd want to do something about it, not just ignore it.
There are in fact many other problems with this type of analysis. Here's another crucial one: the counterfactual for reasoning whether debt is the cause of higher future wellbeing is not having debt. In other words, any such analysis must tell us what would happen if the same students were able to complete college without having to incur debt. Based on what the WSJ reporter said, I don't think this is how they framed the problem.


Hmm, I suppose this would be a problem if they were explicitly saying the ROI is for the entire population. But they added the clause that the ROI calc is just for the population that finishes school within the timeframe. Since the population is discrete, those data points can be segregated into separate populations. As they seem to have done here.
Still, it's a good point, and a common manipulation.
Posted by: Michael Thompson | 05/11/2012 at 07:39 PM
Michael: You're right that the problem is not the way they describe the result; it's the way they conducted the analysis. This falls into what I call "true lies". It's akin to a bank claiming that their loan portfolio has tremendous returns for those loans that did not default. It is a statistic that can be computed but it is useless, and misleading.
Posted by: Kaiser | 05/11/2012 at 09:47 PM
What makes this topic very interesting is also that it is an attempt to quantify the value of an education by reducing it to a monetary equation. If it were for trade schools, I think that it would be more relevant. But I think people are actually waking up to the fact that 1) schools are not guaranteed paths to income and hopefully 2) schools are as much about enlightenment and inquiry as about career choices.
The other buyer-beware statistical fudge is portfolio returns for portfolios with funds that have closed. I'm not current on my disclosure compliance rules, but I recall that a fund manager didn't used to have to disclose returns for all funds, both current and discontinued. Not sure if that has been universally mandated yet.
Posted by: Michael Thompson | 05/12/2012 at 01:27 PM
PayScale actually did consider the issue of dropouts when determining the methodology for the College Return on Investment (ROI) report. While it's true that we didn't include wage data from dropouts directly, we did weight our ROI measure by overall graduation rates for each school, as provided by the Department of Education, to approximate dropouts in our calculation. Schools with lower overall graduation rates will take a bigger hit to their ROI than schools with higher overall graduation rates.
Posted by: Lydia Frank | 05/16/2012 at 03:31 PM
Lydia: Thanks for clarifying but I don't understand how you adjust ROI without wage data. Can you explain what you mean by "a bigger hit" to ROI? What is the nature of this "hit"?
Posted by: Kaiser | 05/22/2012 at 12:24 AM