
Overall, everything you say makes perfect sense!

One small follow-up question: setting aside whether the base rates are correct, and the lack of attention to variation in those base rates, how do you know that a 1% or 2% difference in click rates is "tiny"? I don't know anything about e-mail marketing, but in direct (snail) mail marketing, a difference of 1% in response rate is considered quite meaningful.

Stephanie: it depends on your reference point. A 1-point difference on an 85% base is likely to be noise, but 1 point on a 5% base is probably meaningful. One good way to gauge the underlying variability is to look at how much the rate moves around weekly, monthly, etc.
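The "look at how much the rate moves around" advice can be sketched directly: compare the observed gap to the routine week-to-week spread. The weekly rates below are invented for illustration, not real campaign data.

```python
# Sketch: judge whether a 1-point difference in click rate is noise
# by comparing it to historical week-to-week movement.
# The weekly rates below are made-up illustrative numbers.
import statistics

weekly_click_rates = [0.052, 0.048, 0.055, 0.046, 0.051, 0.049, 0.053, 0.047]

mean_rate = statistics.mean(weekly_click_rates)
sd_rate = statistics.stdev(weekly_click_rates)  # observed weekly fluctuation

observed_gap = 0.01  # a 1-point difference between two providers

# If the gap is small relative to routine fluctuation, treat it as noise.
print(f"mean {mean_rate:.3f}, weekly sd {sd_rate:.4f}")
print("gap exceeds 2 sd of weekly movement:", observed_gap > 2 * sd_rate)
```

On this made-up series the 1-point gap is large relative to weekly wobble; with a noisier history the same gap would be indistinguishable from noise.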

I still don't see how you could know what is "tiny" unless you have actual information on the variance of these particular data. Some data are quite stable, in which case a 1% change may well not be noise. Seems like you are guessing (perhaps in a reasonable, educated way if you have experience with this type of data), but who knows? I would not assume one way or the other myself. For example, assume most people have very predictable behavior and are either routine clickers or routine non-clickers. If there's something particular about a certain e-mail provider's setup that nudges a subset of people to switch behavior, well, there's a story of how 1 or 2% isn't noise.

How would I know? It's what I call numbersense. I wrote an entire book about it.
I'll just mention one aspect of it here. Go read the book for the rest.
There is no such thing as certainty in statistics. Every analysis, including the one derived from "actual" data, is part opinion.

A philosophical question on approach here. Part of the false sense of confidence in the data is supported by the way confidence intervals are created, no? Assuming we're using sqrt(p(1-p)/N), the confidence interval must shrink as N grows, lending an appearance of statistical significance even when the practical significance of a 1% difference in click rate is small. This assumes a test for statistical significance was done at all, of course.

Would you advocate for a sampling rather than N=All approach to counteract this? Setting aside the "design" of the study (using someone else's data, lack of control), when is a lot of data a good thing and when is it simply misleading? Or is a lot of data only misleading due to the "design"?

Thanks!

Subsampling, as you said, is something I do routinely. There is a better method, though it's not always feasible. What those formulas do is use a stochastic model to estimate the systematic variance. No one really checks whether that estimate is accurate. When N is very large, as you noticed, the estimate is almost surely wrong. The solution is to compute empirical estimates of your own systematic variance; that's why I talked about looking at your historical fluctuations. Box, Hunter, and Hunter (1st edition) covers this even before introducing the usual formulas.
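The contrast between the formula-based and empirical variance estimates can be sketched side by side. The monthly rates and the send volume below are invented; the point is only that at large N the binomial formula yields a standard error far smaller than the fluctuation you actually observe.

```python
# Sketch: binomial-formula standard error vs an empirical one computed
# from historical monthly click rates. With huge N the formula's SE is
# tiny, while real month-to-month movement is much larger. Data invented.
import math
import statistics

N = 2_000_000  # emails sent per month (illustrative)
monthly_rates = [0.051, 0.044, 0.058, 0.047, 0.054, 0.043]

p = statistics.mean(monthly_rates)
formula_se = math.sqrt(p * (1 - p) / N)         # stochastic-model estimate
empirical_se = statistics.stdev(monthly_rates)  # observed fluctuation

print(f"formula SE:   {formula_se:.6f}")
print(f"empirical SE: {empirical_se:.6f}")  # far larger here
```

When the empirical spread dwarfs the formula's estimate, differences that the formula calls "significant" are well within the range the process produces on its own.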

All great points to consider when looking at any email analysis.
One way to combat cross-time changes for provider comparisons (or any other "segmentation") would be to hold out a random control group. A lift metric would then be unaffected by Gmail's or other providers' actions, since those hit treatment and control alike. MailChimp would probably never recommend this approach because their revenue is tied to volume. Lifts in product sales from email will be small (though likely still positive ROI), which would push their clients toward optimization and better targeting tactics... and clients not reaching that realization is very good for MailChimp's business.

