


Jordan Erickson

Great 538 article! I feel like I'm getting a behind-the-scenes view into how you did the analysis. If you are able, could you share with us what technique you used to determine the relative influences of variables/factors? (In the "What Do Health Inspectors Care About?" graph.)


Jordan: Glad you liked it. If you open the hidden "footnotes" section of the article, you will see some further description. In short, the chart is based on a classification tree (also known as a decision tree) algorithm that uses those factors to "predict" the grade. I also tried logistic regression, which gives a similar result, but regression output is harder to summarize in this case.
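
For readers wondering how a classification tree can rank factors, here is a minimal sketch of one common approach: score each factor by how much a split on it reduces Gini impurity in the grade. This is not the analysis behind the chart. The records, the factor names (`vermin`, `cold_food_temp`), and the helper functions are all made up for illustration, and a real tree would repeat this scoring recursively across many splits.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_reduction(rows, factor, target="grade"):
    """Drop in Gini impurity from splitting the rows on a yes/no factor."""
    parent = gini([r[target] for r in rows])
    yes = [r[target] for r in rows if r[factor]]
    no = [r[target] for r in rows if not r[factor]]
    n = len(rows)
    weighted = (len(yes) / n) * gini(yes) + (len(no) / n) * gini(no)
    return parent - weighted


# Hypothetical inspection records: NOT real data, invented for this sketch.
rows = [
    {"vermin": True, "cold_food_temp": True, "grade": "B"},
    {"vermin": True, "cold_food_temp": False, "grade": "B"},
    {"vermin": True, "cold_food_temp": False, "grade": "B"},
    {"vermin": False, "cold_food_temp": True, "grade": "B"},
    {"vermin": False, "cold_food_temp": True, "grade": "A"},
    {"vermin": False, "cold_food_temp": False, "grade": "A"},
    {"vermin": False, "cold_food_temp": False, "grade": "A"},
    {"vermin": False, "cold_food_temp": False, "grade": "A"},
]

factors = ["vermin", "cold_food_temp"]

# Rank the factors by how well a single split on each separates the grades.
ranking = sorted(
    ((f, impurity_reduction(rows, f)) for f in factors),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in ranking:
    print(f"{name}: impurity reduction = {score:.3f}")
```

In this toy data, the factor with the larger impurity reduction is the one the tree would split on first, which is the sense in which a tree assigns "relative influence" to its inputs.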

Jordan Erickson

Ah, didn't see the hidden footnotes. That helps. Now I just have to wait until I take my class on classification trees (in grad school)...


What does "All rating schemes will be gamed to death" mean?



Business analytics and data visualization expert. Author and Speaker. Founder of Principal Analytics Prep, MS Applied Analytics at Columbia. See my full bio.
