I saw this ad by CUNY on the New York subway the other day:
One would think that this is a fact-based statement that anyone can verify. In reality, despite the many big words, the statement is imprecise. To truly understand it, we have to ask the writer to define many terms.
First, "6 times as many" is a number without a unit. Are they talking about 6 times as many students? Or is it that the rate of propulsion is 6 times higher? If the unit is students, then we need to know the relative sizes of the student bodies at the CUNY campuses and at the eight Ivies combined.
It's clear that CUNY takes many more students from lower-than-middle-class families than the Ivies do - there may not be that many students in the Ivy League schools whose families are lower-than-middle class. If they are referring to the absolute number of students, then the Ivies will be no match simply because they enroll fewer students from lower-income families to begin with.
Are they counting all students who start first year, or just students who graduate? Do they include graduate and/or post-graduate students?
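To see why the unit matters, here is a minimal sketch in Python. Every number in it is made up for illustration (the group names, enrollments, and outcomes are assumptions, not CUNY or Ivy League data). It shows how one group can send "6 times as many" students into the middle class by count while having the lower rate.

```python
from dataclasses import dataclass


@dataclass
class SchoolGroup:
    name: str
    low_income_students: int   # hypothetical: students from lower-income families
    reached_middle_class: int  # hypothetical: of those, how many later reached the middle class

    @property
    def rate(self) -> float:
        # Share of lower-income students who reached the middle class.
        return self.reached_middle_class / self.low_income_students


# Made-up figures, chosen only to illustrate the count-versus-rate distinction.
group_a = SchoolGroup("Group A (large public system)", 60_000, 30_000)
group_b = SchoolGroup("Group B (small private schools)", 6_000, 5_000)

count_ratio = group_a.reached_middle_class / group_b.reached_middle_class
rate_ratio = group_a.rate / group_b.rate

print(f"By count: Group A moves {count_ratio:.1f}x as many students")
print(f"By rate:  Group A's rate is {rate_ratio:.2f}x Group B's")
```

With these invented numbers, the count ratio is 6 while the rate ratio is below 1, so the same data supports opposite-sounding headlines depending on which unit the writer had in mind.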
Second, "middle class and beyond" also needs a definition. There is no standard for the middle class. Are they looking at individual or household income?
Third, timing matters a lot. What is the observation window? Presumably they are not measuring the graduates one month or one year after they graduate - but we don't know for sure.
Fourth, how accurate are the income estimates? Are the inaccuracies uniform across all these schools? Is the amount of missing data comparable across all these schools?
Fifth, one must ask if there are any shameless shenanigans such as data being removed because of "outliers" and so on. An example might be: we don't want to include a particular campus because it is too new, and would not be representative of future results.
***
What are some practical lessons from this?
If a data analyst is given this statement and free access to the database, he or she will have a hard time replicating the analysis that leads to the conclusion. There are simply too many unspoken definitions.
Bringing such a statement to a meeting does not bring certainty. It may provoke many questions. People in the meeting may not agree on these definitions. Changing definitions may change the conclusion.
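Here is a small sketch of how a change in definition can flip a conclusion. The incomes and the two "middle class" thresholds are invented for illustration; nothing here comes from the actual study or the ad.

```python
# Made-up adult incomes (in thousands of dollars) for former low-income
# students at two hypothetical groups of schools.
incomes_a = [28, 35, 41, 44, 52, 58, 63, 75]
incomes_b = [38, 66, 90, 120]


def count_reaching(incomes, threshold):
    """Count people whose income is at or above the chosen threshold."""
    return sum(1 for income in incomes if income >= threshold)


def count_ratio(threshold):
    """Ratio of Group A's count to Group B's count under a given threshold."""
    return count_reaching(incomes_a, threshold) / count_reaching(incomes_b, threshold)


# Two equally arguable (and equally hypothetical) definitions of "middle class".
for threshold in (40, 60):
    print(f"Threshold {threshold}k: A/B count ratio = {count_ratio(threshold):.2f}")
```

Under the lower threshold, Group A sends twice as many people into the "middle class" as Group B; under the higher threshold, it sends fewer. Nothing about the data changed, only the definition.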
There is no such thing as precise data.
In case you haven't seen the source, these figures appear to be from the Raj Chetty research on 'equality of opportunity' - they calculated mobility rates for colleges and universities. The CUNY system ranks among the highest nationally, at 7.2% ('the fraction of students who come from a family in the bottom fifth of the income distribution and end up in the top fifth of the income distribution'), which is probably about 6X the rate for the Ivies.
The data for it are here: http://www.equality-of-opportunity.org/college/
Posted by: Orlo | 07/20/2017 at 02:53 PM