I very much enjoyed reading The Chronicle's article on "education deserts" in the U.S., defined as places where there are no public colleges within reach of potential students.
In particular, the data visualization deployed to illustrate the story is superb. For example, this map shows 1,500 colleges and their "catchment areas" defined as places within 60 minutes' drive.
It does a great job walking through the logic of the analysis (even if the logic may not totally convince - more below). The areas not within reach of these 1,500 colleges are labeled "deserts". They then take Census data and look at the adult population in those deserts:
This leads to an analysis of the racial composition of the people living in these "deserts". We now arrive at the only chart in the sequence that disappoints. It is a pair of pie charts:
The color scheme makes it hard to pair up the pie slices. The focus of the chart should be on the over- or under-representation of races in education deserts relative to the U.S. average. The challenge of this dataset is the coexistence of one large number and many small numbers.
Here is one solution:
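The comparison against the national average can also be expressed numerically. Below is a minimal sketch of an over/under-representation index, where 100 means a group's share in the deserts matches its share in the U.S. overall. The shares are made-up placeholders, not the Chronicle's figures, and the function name `representation_index` is my own.

```python
# Hypothetical shares (percent of adults). These are illustrative
# placeholders, NOT the figures from the Chronicle's article.
us_share = {
    "White": 60.0, "Hispanic": 18.0, "Black": 12.0,
    "Asian": 6.0, "Native American": 1.0, "Other": 3.0,
}
desert_share = {
    "White": 70.0, "Hispanic": 14.0, "Black": 8.0,
    "Asian": 1.0, "Native American": 4.0, "Other": 3.0,
}

def representation_index(desert, national):
    """Desert share divided by national share, times 100 (100 = parity)."""
    return {group: 100.0 * desert[group] / national[group] for group in national}

index = representation_index(desert_share, us_share)

# List groups from most over-represented to most under-represented.
for group, value in sorted(index.items(), key=lambda kv: kv[1], reverse=True):
    if value > 100:
        label = "over"
    elif value < 100:
        label = "under"
    else:
        label = "at parity"
    print(f"{group:16s} {value:6.1f}  ({label})")
```

Putting every group on a ratio scale is one way around the one-large-number, many-small-numbers problem: a small group that is heavily over-represented stands out instead of disappearing into a thin pie slice.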
The Chronicle made a commendable effort to describe this social issue. But the analysis has a lot of built-in assumptions. Readers should look at the following list and decide whether they agree with the assumptions:
- Only public colleges are considered. This restriction requires the assumption that the private colleges pretty much serve the same areas as public colleges.
- Only non-competitive colleges are included. Specifically, the acceptance rate must be higher than 30 percent. The underlying assumption is that the "local students" won't be interested in selective colleges. It's not clear how the 30 percent threshold was decided.
- Colleges that are more than 60 minutes' driving distance away are considered unreachable. So the assumption is that "local students" are unwilling to drive more than 60 minutes to attend college. This raises a couple of other questions: are we only looking at commuter colleges with no dormitories? Is the 60-minute driving distance based on actual roads and traffic speeds, or some kind of simple model with stylized geometries and fixed speeds?
- The demographic analysis is based on all adults living in the Census "blocks" that are not within 60 minutes' drive of one of those colleges. But if we are calling them "education deserts" focusing on the availability of colleges, why consider all adults, and not just adults in the college age group? One further hidden assumption here is that the lack of colleges in those regions has not caused young generations to move to areas closer to colleges. I think a map of the age distribution in the "education deserts" will be quite telling.
- Not surprisingly, the areas classified as "education deserts" lag the rest of the nation on several key socio-economic metrics, like median income, and proportion living under the poverty line. This means those same areas could be labeled income deserts, or job deserts.
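To make the question about the 60-minute threshold concrete, here is a sketch of what a "simple model with stylized geometries and fixed speeds" might look like. It is not the Chronicle's method, which the article does not spell out. The fixed speed of 80 km/h, the detour factor of 1.3 applied to straight-line distance, and all function names are assumptions of mine.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle (straight-line) distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def stylized_drive_minutes(origin, college, speed_kmh=80.0, detour_factor=1.3):
    """Stylized drive time: straight-line distance inflated by an assumed
    detour factor, then divided by an assumed fixed speed. No real roads
    or traffic are involved."""
    road_km = haversine_km(*origin, *college) * detour_factor
    return 60.0 * road_km / speed_kmh

def is_desert(origin, colleges, limit_minutes=60.0, **model_params):
    """True if no college is within the time limit under the stylized model."""
    return all(
        stylized_drive_minutes(origin, college, **model_params) > limit_minutes
        for college in colleges
    )

# Hypothetical coordinates (lat, lon), for illustration only.
block = (40.0, -100.0)
near_college = (40.3, -100.0)  # about 33 km away in a straight line
far_college = (41.0, -100.0)   # about 111 km away in a straight line

print(is_desert(block, [far_college]))
print(is_desert(block, [far_college, near_college]))
```

Under such a model, the desert boundary moves whenever the assumed speed or detour factor changes, which is why it matters whether the analysis used real road networks or something closer to this.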
At the end of the piece, the author creates a "story time" moment. Story time is when you are served a bunch of data or analyses, and then when you are about to doze off, the analyst calls story time, and starts making conclusions that stray from the data just served!
Story time starts with the following sentence: "What would it take to make sure that distance doesn’t prevent students from obtaining a college degree?"
The analysis provided has nowhere shown that distance has prevented students from obtaining a college degree. We haven't seen anything that says that people living in the "education deserts" have fewer college degrees. We don't know that distance is the reason why people in those areas don't go to college (if true) - what about poverty? We don't know if 60 minutes is the hurdle that causes people not to go to college (if true). We know the number of adults living in those neighborhoods but not the number of potential students.
The data only showed two things: 1) which areas of the country are not within 60 minutes' driving of the subset of public colleges under consideration, 2) the number of adults living in those Census blocks.
So we have a case where the analysis is incomplete but the visualization of the analysis is superb. In our Trifecta analysis, this chart poses a nice question and has nice graphics, but the use of data can be improved. (Type QV)