MailChimp, a major vendor that companies use to send marketing emails to customers, published an analysis of the effect of Gmail marketing tabs (link). How should you read such a study?
I'd begin by clarifying what problem the analyst is solving. In May, Google rolled out a tabbed interface to all Gmail users, splitting the inbox into three parts: the regular inbox, a "promotional" tab, and a "social" tab.
Immediately, everyone assumed that this change would hurt email marketers (we are talking about legitimate companies here, not spammers). The MailChimp analyst is using data to validate this hypothesis, which is a wonderful endeavor in the spirit of this blog.
Next, I'd identify the analysis strategy used to arrive at the answer. This analyst is using a pre-post analysis while controlling (ex-post) for a single factor. In layman's terms, that means the analyst compares the open and click rates before the tabs rollout with those rates after the tabs rollout. But that difference can be misleading because the pre-post analysis by itself does not prove that the tabs rollout was the cause of any observed difference. For example, there may be a seasonal change in open rates regardless of the tabs rollout.
Recognizing this, the analyst used other email providers as a natural "control" for this single factor (seasonality). The idea is that if seasonality caused the change in open rates, then the other email providers should exhibit the same seasonal change over the same time window. This is a reasonable supposition, but you might already be asking: why must the seasonal effect be identical across email providers?
Good question! It doesn't have to be, which tells you that the outcome of this analysis is valid only under the assumption that the seasonal effect is identical across email providers. (See post #2 for other strong assumptions needed due to controlling for only one factor.)
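The pre-post-with-control logic described above amounts to a simple difference-in-differences. Here is a minimal sketch; all the rates below are made up for illustration and are not from the MailChimp study:

```python
# Hypothetical open rates before and after the Gmail tabs rollout.
# (Numbers are invented for illustration only.)
gmail = {"pre": 0.130, "post": 0.120}
others = {"pre": 0.128, "post": 0.126}   # other email providers = "control"

# A naive pre-post comparison attributes the entire change to the rollout.
naive_effect = gmail["post"] - gmail["pre"]

# The control group's pre-post change estimates the seasonal component,
# under the strong assumption that seasonality is identical across providers.
seasonal = others["post"] - others["pre"]

# Difference-in-differences: Gmail's change minus the control's change.
tabs_effect = naive_effect - seasonal
print(round(tabs_effect, 3))   # -0.008 with these invented numbers
```

The sketch also makes the key assumption visible: if the seasonal effect differs across providers, subtracting the control's change no longer isolates the tabs effect.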
Once I am satisfied with the analysis strategy, I look at the quality of the data. I noticed one red flag here. Looking at the click rate chart (imagine it as a line chart, rather than a column chart with an axis that doesn't start at zero!), I am shocked that the average click rate was in the 85% range. This says that almost all of the people who open emails click on something inside the email. Having seen email clickthrough data at various companies, I am skeptical that these rates are correct.
I did leave a comment on the blog asking them to check their data, but as of today, it appears to have been lost in cyberspace - or censored. My friend who originally shared the blog post left a comment, and it went through.
The analyst seems to have little sense of what real-world clickthrough rates look like! He convinced himself that the rate must be correct since it is what the data say, and then threw in a distraction: that there are two ways to measure click rates, one based on the number of emails sent and the other based on the number of emails opened. Not surprisingly, the latter is much higher than the former.
By his count, the ratio of clicks to emails sent is in the 10 to 20 percent range. That too is way too high. If you tell me a few email campaigns achieve such a rate, I'd believe it. But given that his study is "BIG DATA", with 29 billion emails, 4.9 billion opens, 4.2 billion clicks, and 43.5 million unsubscribes, presumably spanning a large number of clients across many different industries, it is hard to fathom what it means to say that one out of every five to ten emails sent gets clicked on.
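To see where both figures come from, apply the two definitions to the aggregate numbers quoted in the study (29 billion sent, 4.9 billion opens, 4.2 billion clicks):

```python
# Aggregate figures quoted from the MailChimp study.
sent, opens, clicks = 29e9, 4.9e9, 4.2e9

# Definition 1: clicks per email opened -- the suspicious ~85% figure.
click_to_open = clicks / opens

# Definition 2: clicks per email sent -- the "10 to 20 percent" figure.
click_to_sent = clicks / sent

print(f"click-to-open: {click_to_open:.1%}, click-to-sent: {click_to_sent:.1%}")
```

Either way you slice it, the implied click rates sit far above what is typical for bulk marketing email, which is the red flag.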
I'm not bashing the analyst here. Every data analyst will encounter this type of situation over and over. You are convinced that your number must be correct - because you know the data, you know the steps you took, you know the care you took to compute the rates.
When someone else points out the rates don't sound right, you're scratching your head. You know it's just a simple formula, the sum of clicks divided by the sum of opens, so you think there are only a few ways it could go wrong. Further, the person raising the doubt has no data so what could he/she know?
In reality, there are many ways to skin the cat of a simple formula. Have the data been cleansed of bots and suspicious clicks? What are the time windows for counting each item? How are multiple opens or clicks by the same entity treated? And so on.
This is the test of how good an analyst someone is. This is when the analyst demonstrates numbersense. How much time does it take to figure out what is driving these crazy numbers?
The reason I'm not bashing the analyst is this: if you tally up every time the person with no data raises doubts about analytics data, I'd say probably 80 percent of the time the data are fine, and perhaps 5 percent of the time the data have serious errors (defined as: the conclusion changes after the fix).
Of course, if you are the manager of a data team, you want to manage to those ratios. If your analysts are wrong much more often, some remedial action should be taken to improve performance.
In my next post, I'll look at the MailChimp study from the perspective of Big Data.