

"It seems likely that the difference between the 5th and 7th grader is not statistically significant - which, if true, is a sad commentary on our education standards."

Is it? I think it doesn't necessarily mean that kids aren't learning between fifth and seventh grades; it could just be that variation among individuals of a certain age is really wide--fifth graders don't all cluster tightly around "grade level" with a few gifted/delayed outliers in each direction. I was surprised by the three-standard-deviation comment, but when I thought about my schooling experience I realized that it made a lot of sense.


Hi Jeff: I'd think the variability among 5th graders should be similar to the variability among 7th graders, but because of selection effects, that may be false. However, I still stand by the original statement. Remember that the 7th grader here is taking the 5th grade test, not a 7th grade test. Also, in that last sentence, you mean three grade levels, not three standard deviations; the researchers say one SD is roughly three grade levels.


Thanks--yes, I understand the distinction; I just misphrased it.

I suppose I agree with you about the state of our educational standards if you are critiquing the way we are defining and measuring success. The expectation that everyone will be in the same place at the same age is unrealistic. What people usually mean when they say something like this, though, is that our expectations aren't high enough, or that the instruction isn't sufficiently rigorous, or some such. This may also be true, but is a wide range in ability relative to the pace of the curriculum evidence of it? Given all the factors in and outside of school that can affect student performance, it's not obvious to me that it is.


Jeff: I am also an educator so this is self-critique. One factor not discussed here is the measurement instrument... I did not investigate what tests are used to calibrate the grade level equivalent scale.

I do think that, to a first approximation, we want the average student in 7th grade to do better than the average 5th grader on a 5th grade test. In theory, a 5th grader who progresses to the 7th grade should be accumulating knowledge, so the entire score distribution should be moving up (variability may change, though, as some individuals may progress faster). But as you pointed out, the system is complex, and we are comparing different cohorts, so the reality may not be as clear-cut.

I am mostly talking about the "average" student which doesn't exist but is a good aggregate measure. This is distinct from variability, which you are very concerned about.
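
To make the distinction between a shift in the average and the spread among individuals concrete, here is a minimal sketch. It assumes scores in each grade are normally distributed with the same SD of three grade levels (the figure attributed to the researchers above) and that the average 7th grader sits two grade levels above the average 5th grader. The normality, the equal spread, the two-level gap, and the group size of 100 are all illustrative assumptions, not values taken from the study.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_older_outscores(mean_gap: float, sd: float) -> float:
    """Chance that one random older student outscores one random younger student.

    The difference of two independent normal scores with equal SD
    has standard deviation sd * sqrt(2).
    """
    return normal_cdf(mean_gap / (sd * math.sqrt(2.0)))


def z_for_group_means(mean_gap: float, sd: float, n_per_group: int) -> float:
    """z statistic for the difference between two group averages of equal size."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    return mean_gap / standard_error


if __name__ == "__main__":
    SD_GRADE_LEVELS = 3.0  # from the thread: one SD is roughly three grade levels
    MEAN_GAP = 2.0         # assumption: 7th grade average minus 5th grade average

    print("P(random 7th grader beats random 5th grader):",
          round(prob_older_outscores(MEAN_GAP, SD_GRADE_LEVELS), 3))
    print("z, one student per grade:",
          round(z_for_group_means(MEAN_GAP, SD_GRADE_LEVELS, 1), 2))
    print("z, 100 students per grade (hypothetical):",
          round(z_for_group_means(MEAN_GAP, SD_GRADE_LEVELS, 100), 2))
```

Under these assumptions, a comparison of two individual students falls well short of the usual 1.96 threshold, while the same gap between group averages is easy to detect once each group has enough students. That is consistent with both points in this exchange: individuals can overlap heavily even when the distribution as a whole has moved up.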


Minor nitpick: if you refer to "item 2" or "item 3" in a bulleted list, it is easier to see those items if the list is numbered rather than simply using dots...

PS: I hate to be "that guy".


D: I fixed it, just for you :)


I think drawing an indifference curve is a terrible suggestion, because commute length data is garbage. It is not a proxy for distance to New York; one can just as well put Topeka on that graph; it would be a green dot, indicating a short commute.

Better to bin the whole thing!

Luke Smith

Some self-promotion here, but I did take a look at the data and attempted to aggregate it myself (as best I could): https://seasmith.github.io/blog/reimagining_a_data_viz_good_schools_affordable_homes_nyt/

Also, the data is hidden behind an XHR, in a file called districts2.json, if I recall correctly. I didn't bother to dig into the Stanford data.



Business analytics and data visualization expert. Author and Speaker. Founder of Principal Analytics Prep, MS Applied Analytics at Columbia. See my full bio.
