On page 8 of Numbersense (link), I wrote:
Web logs are a messy, messy world. If two vendors are deployed to analyze traffic on the same website, it is guaranteed that their statistics would not reconcile, and the gap can be as high as 20 or 30 percent.
Insiders will nod their heads; for those who aren’t familiar with Web data, take a look at this recent post on The Verge about the metrics of Web traffic, whimsically titled "Yahoo is bigger than Google, but Google is bigger than Yahoo." This post is a reaction to the recent announcement that Yahoo sites have surpassed Google sites in terms of traffic--well, according to one way of counting.
***
Note that Web companies like Google, Yahoo!, and Facebook live and die by these numbers. All these companies offer “free” services to users, and they generate revenues by selling (out) these users to advertisers. Advertisers want “reach,” which is typically measured by such metrics as unique visitors and unique visits. When you hear people claim that digital advertising is more “measurable” than traditional advertising, they are referring to metrics of this type.
The Verge reporter compares three measurement services: Comscore (the dominant player), Compete and SimilarWeb. These services could not agree on whether Google or Yahoo! was bigger, or whether they were the same size. Moreover, their numbers differ by orders of magnitude.
***
Read between the lines, and you realize that these services don’t agree on the definitions of those metrics either. What is a unique visit? Conceptually, a visit is a browsing session. In reality, it’s not easily nailed down. Say, you go home from work and start watching a movie on Netflix. Half an hour later, your Mum calls, and you pause the movie. Another 30 minutes later, you finish the call, and continue watching the movie. If you ask me, I’d say only one Web visit should be counted. On the other hand, if after the call, you abandoned your movie, and checked your email, I’d argue you started a second unique visit.
The reality is that Comscore and its competitors do not have CCTV installed in your home. All they have is the Web log, which has an entry for every page you viewed and the time you went there. How much idle time should pass before one declares a new visit? If you set this to 30 minutes, then in the example above, you’d count two unique visits. If you set it to 180 minutes, you’d count only one visit.
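To make the idle-time question concrete, here is a minimal sketch in Python of how a visit count might be derived from a Web log. It is not how Comscore or any other vendor actually does it. The function name `count_visits`, the rule that a gap strictly longer than the threshold starts a new visit, and the timestamps are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta


def count_visits(timestamps, idle_timeout_minutes):
    """Count visits in one user's log entries.

    Assumed rule (hypothetical): a new visit starts whenever the gap
    between two consecutive entries is longer than the idle timeout.
    """
    if not timestamps:
        return 0
    ordered = sorted(timestamps)
    timeout = timedelta(minutes=idle_timeout_minutes)
    visits = 1  # the first entry always opens a visit
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > timeout:
            visits += 1
    return visits


# Made-up log entries for the movie example: start, pause for the
# phone call, then resume a little over 30 minutes later.
log = [
    datetime(2013, 10, 1, 19, 0),   # start the movie
    datetime(2013, 10, 1, 19, 30),  # pause when the phone rings
    datetime(2013, 10, 1, 20, 5),   # resume after the call (35 idle minutes)
]

visits_with_30 = count_visits(log, 30)
visits_with_180 = count_visits(log, 180)
print(visits_with_30, visits_with_180)
```

The same three log entries yield two visits under a 30-minute threshold and one visit under a 180-minute threshold. Nothing in the data changed, only the definition.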
If you believed the hype that online businesses are inherently more measurable, think again. They may be measurable in the sense that you can more easily collect data, but the data is only approximate, and it varies wildly depending on who’s collecting it.
***
PS. I’m not against subjectivity in measurement. In fact, I believe every metric has subjective elements. I’m just saying you should find out how things are measured. This is one of the key takeaways from Numbersense (link).
I wanted to share a video that I think can be helpful for your readers that deals with planning and executing a Big Data program. (http://www.youtube.com/watch?v=Ow76L0IEZNY) This video is based off of TEKsystems research and delivers the message in a cute way through multiple sci-fi references.
Posted by: Alan Lucaz | 10/02/2013 at 05:08 PM
I thought the whole point of Google (and the primary factor driving its success) is that it made this whole broadcast-derived estimation of "webshare" irrelevant. Google makes advertisers bid for clicks, not charge on the basis of impressions (which is what I believe most advertisers did pre-google), let alone estimates of impressions. I don't know how FB or Yahoo charge now, but I can't imagine they can diverge much from the industry leader (Google).
Analysis of weblogs is another thing entirely, but I don't think Google or Yahoo are anywhere near living or dying over these stats, other than for PR and (maybe) for what they may suggest about trends. Google can count the pennies coming in with each advertising click, and that's what they live & die on.
Posted by: Gary | 10/02/2013 at 06:19 PM
Or I'm wrong. :)
Posted by: Gary | 10/02/2013 at 06:20 PM
Gary: Google purchased DoubleClick, the giant of display advertising, some years ago, so it is in both the clicks and the impressions business. Clicks themselves are also not the ironclad metric you think they are. Start with click fraud and click farms, followed by accidental clicks. Then there are clicks that do not lead to purchases, and clicks that are misattributed.
Also, try to describe what clicks accomplish from a marketing/advertising perspective, and you are in rather barren territory.
Posted by: Kaiser | 10/03/2013 at 02:03 AM