Professor Andrew Gelman is a pioneer in statistics blogging. His blog is one of my regular reads, a mixture of theoretical pieces, applied work, psychological musing, rants about unethical academics, advocacy of statistical graphics, and commentary on literature. He's one of the few statisticians who get opinion pieces published in the New York Times. His expertise is statistics in politics, but I also enjoy his work on the stop-and-frisk policies of the NYPD, his debunking of evolutionary psychology and ESP research, etc. (I also recommend our collaborative piece on the Freakonomics franchise.) He is a co-author of Bayesian Data Analysis, one of the most influential textbooks on Bayesian statistics; the third edition is out soon (link).
Here is my interview with Andrew.
KF: Andrew, you have impeccable credentials, degrees from MIT and Harvard, Professor at Columbia, Fellow of ASA and IMS, etc., but in my experience, having degrees doesn't automatically prepare one to do great applied data work like you have done. What's your secret?
Here's one thing regarding "great applied work": Ask yourself the question: What makes a statistician look like a hero? You might think that the answer would be, Extracting a small faint signal from noise. But I don't think so. I think that a statistician looks like a hero by studying large effects.
Statisticians have been studying ESP for decades, trying to tease out tiny signals amid masses of noise, and they just look like chumps. But in the projects I've worked on that have been successful, we've been aiming at big fat targets--things like incumbency in elections, or the effects of redistricting, or predicting home radon levels (not such a hard task; radon levels vary a lot by geography), or measuring the number of friends people have, etc etc etc. In these problems, my statistical successes have often come from methods that have allowed the combination of information from different sources. Often what is important about a statistical method is not what it does with the data, but rather what data it uses. Good methods have the flexibility to incorporate more information into the analysis.
I've picked up skills over the years, and I'd definitely say I'm better at data analysis and statistical reasoning than I used to be. On the other hand, whenever I've tried to design an experiment for my substantive research, I've failed miserably. I've had lots of success in my research in social science and public health, but almost all involving the analysis of existing data. I attribute my inability to design an experiment to a combination of lack of practice and lack of natural talent. Measurement is central to statistics and is a completely different thing than data analysis.
KF: What is your pet peeve about published data interpretations?
I've called it the lure of certainty. It's a problem with researchers and with consumers of research as well: they don't want to acknowledge uncertainty and variation. There's lots of talk about the treatment effect without a sense that it can and will vary, that a small positive effect in one setting might be negative in another, and that existing data might not be enough to determine its sign, even in the context of the data collection. I get so frustrated when people take a "p less than .05" statistically significant result as definitive evidence.
Here's an example. Recently the journal Psychological Science published some papers claiming scientific results based on college students and participants in the online Mechanical Turk system. I criticized these results for lack of generality and the response was that this is standard practice, that it's naive to criticize a psychology study for having a nonrepresentative sample.
In this case, though, I think the naive criticism is correct. If you're estimating an effect with large average size and small variation, then it's ok to use a nonrepresentative sample. But in these social psychology examples, we have every reason to believe that main effects are small and interactions are large--that is, effects could be positive in some groups and negative in others. In such settings, a nonrepresentative sample can kill you. But people don't want to think about this, because it's more comfortable to think about effects as "real" or "not real" without acknowledging variation.
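To make the point about nonrepresentative samples concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the two group names, their population shares, and the effect sizes are invented, and none of it comes from the interview or from the studies mentioned. It sets up a treatment whose effect is positive in one group and negative in the other, then compares a randomized experiment run only on the first group with one run on a representative sample.

```python
import random
import statistics

# Hypothetical numbers, invented for illustration only.
POPULATION_SHARES = {"students": 0.2, "everyone_else": 0.8}
TRUE_EFFECT = {"students": 0.5, "everyone_else": -0.3}  # effect differs in sign by group


def population_average_effect(shares, effects):
    """Average treatment effect, weighting each group by its population share."""
    return sum(shares[g] * effects[g] for g in shares)


def simulate_estimate(sample_shares, n=20000, noise_sd=1.0, seed=0):
    """Difference-in-means estimate from a randomized experiment whose
    participants are drawn from the groups with the given shares."""
    rng = random.Random(seed)
    groups = list(sample_shares)
    weights = [sample_shares[g] for g in groups]
    treated, control = [], []
    for _ in range(n):
        group = rng.choices(groups, weights=weights)[0]
        is_treated = rng.random() < 0.5  # coin-flip assignment
        outcome = rng.gauss(0.0, noise_sd)
        if is_treated:
            outcome += TRUE_EFFECT[group]
        (treated if is_treated else control).append(outcome)
    return statistics.mean(treated) - statistics.mean(control)


true_average = population_average_effect(POPULATION_SHARES, TRUE_EFFECT)
students_only = simulate_estimate({"students": 1.0, "everyone_else": 0.0})
representative = simulate_estimate(POPULATION_SHARES)

print(f"population average effect: {true_average:+.2f}")
print(f"students-only estimate:    {students_only:+.2f}")
print(f"representative estimate:   {representative:+.2f}")
```

Under these made-up numbers, the students-only experiment is internally valid and still gets the sign of the population effect wrong, because the interaction with group is larger than the main effect. If the effect were the same in both groups, the two estimates would agree up to noise, which is the case where a convenience sample is harmless.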
KF: What sources do you turn to for reliable data analyses?
KF: Thank you.
Previous Numbersense Pros: