



"It seems likely that the difference between the 5th and 7th grader is not statistically significant - which, if true, is a sad commentary on our education standards."

Is it? I think it doesn't necessarily mean that kids aren't learning between fifth and seventh grades; it could just be that variation among individuals of a certain age is really wide--fifth graders don't all cluster tightly around "grade level" with a few gifted/delayed outliers in each direction. I was surprised by the three-standard-deviation comment, but when I thought about my schooling experience I realized that it made a lot of sense.


Hi Jeff: I'd think the variability among 5th graders should be similar to the variability among 7th graders, but because of selection effects, that may be false. However, I still stand by the original statement: remember that the 7th grader here is taking the 5th grade test, not a 7th grade test. Also, in that last sentence, you mean three grade levels, not three standard deviations; the researchers say one SD is roughly three grade levels.


Thanks--yes, I understand the distinction; I just misphrased it.

I suppose I agree with you about the state of our educational standards if you are critiquing the way we are defining and measuring success. The expectation that everyone will be in the same place at the same age is unrealistic. What people usually mean when they say something like this, though, is that our expectations aren't high enough, or that the instruction isn't sufficiently rigorous, or some such. This may also be true, but is a wide range in ability relative to the pace of the curriculum evidence of it? Given all the factors in and outside of school that can affect student performance, it's not obvious to me that it is.


Jeff: I am also an educator so this is self-critique. One factor not discussed here is the measurement instrument... I did not investigate what tests are used to calibrate the grade level equivalent scale.

I do think that, to a first order, we want the average student in 7th grade to do better than the average 5th grader on a 5th grade test. In theory, a 5th grader who progresses to the 7th grade should be accumulating knowledge, so the entire score distribution should be moving up (though variability may change, as some individuals may progress faster). But as you pointed out, the system is complex, and we are comparing different cohorts, so the reality may not be as clear-cut.

I am mostly talking about the "average" student, which doesn't exist but is a good aggregate measure. This is distinct from variability, which you are very concerned about.
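
To make the average-versus-variability distinction concrete, here is a rough sketch in Python. It assumes scores are normally distributed, that both grades share the same SD of three grade levels (the researchers' rough figure quoted above), and that the 7th grade average sits two grade levels above the 5th grade average. The normality, the equal spread, and the two-level gap are illustrative assumptions, not figures taken from the study's data.

```python
from statistics import NormalDist

# Assumed values for illustration only (not taken from the study's data).
SD_GRADE_LEVELS = 3.0  # "one SD is roughly three grade levels"
MEAN_GAP = 2.0         # assumed gap between 7th and 5th grade averages


def prob_higher_score(mean_gap: float, sd: float) -> float:
    """Chance that a randomly chosen student from the higher-mean group
    outscores a randomly chosen student from the lower-mean group,
    assuming independent normal scores with a common SD."""
    # The difference of two independent normals with equal SD
    # is normal with SD equal to sd * sqrt(2).
    difference = NormalDist(mu=mean_gap, sigma=sd * 2 ** 0.5)
    return 1.0 - difference.cdf(0.0)


effect_size = MEAN_GAP / SD_GRADE_LEVELS
p_higher = prob_higher_score(MEAN_GAP, SD_GRADE_LEVELS)

print(f"Gap in SD units: {effect_size:.2f}")
print(f"P(random 7th grader beats random 5th grader): {p_higher:.2f}")
```

Under these assumptions the averages differ by about two-thirds of an SD, yet a randomly chosen 7th grader outscores a randomly chosen 5th grader only about two times in three. The whole distribution can shift upward while individual students still overlap heavily.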


Minor nitpick: if you refer to "item 2" or "item 3" in a bulleted list, it is easier to see those items if the list is numbered rather than simply using dots...

PS: I hate to be "that guy".


D: I fixed it, just for you :)


I think drawing an indifference curve is a terrible suggestion, because commute length data is garbage. It is not a proxy for distance to New York; one can just as well put Topeka on that graph; it would be a green dot, indicating a short commute.

Better to bin the whole thing!

Luke Smith

Some self-promotion here, but I did take a look at the data and attempt to aggregate it myself (as best I could): https://seasmith.github.io/blog/reimagining_a_data_viz_good_schools_affordable_homes_nyt/

Also, the data is hidden behind an XHR, in a file called districts2.json, if I recall correctly. I didn't bother to dig into the Stanford data.


Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.