Google Trends

Google Trends is both fascinating science and a dangerous tool.  The following example is lifted from Andrew Sullivan's blog.

First off, this is a supreme example of turning volumes of data into useful information. (If you data-mine, you'll understand the amount of work needed to generate something like this in automated fashion.) The chart compares the volume of traffic for different search keywords over time. The lines are sharp, and a well-chosen amount of smoothing is applied so that some spikes are seen but not too many. The concept of flagging certain "special" points is also admirable. No wonder this caught the attention of lots of marketers!

However, user beware!  For unexplained reasons, all of the information required to interpret this chart is missing.  The vertical scale is missing, which means that we do not know how many searches include the word "blog".  While the relative gap between the lines is large, the absolute difference may in fact be tiny.
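
To make the point concrete, here is a minimal sketch in Python. The index values and both candidate scales are invented for illustration; they are not Google Trends data, and the real scale is exactly what we are not told.

```python
# Hypothetical relative index values, as one might read them off an
# unlabeled chart. Invented for illustration; not Google Trends data.
relative_index = {"blog": 8.0, "new york times": 1.0}


def absolute_gap(index, searches_per_unit):
    """Convert index values to absolute counts under an assumed scale and
    return the gap between the largest and smallest keyword."""
    counts = {term: value * searches_per_unit for term, value in index.items()}
    return max(counts.values()) - min(counts.values())


def relative_ratio(index):
    """Ratio of the largest to the smallest index value (scale-free)."""
    return max(index.values()) / min(index.values())


# Two assumed vertical scales; the chart is consistent with either one.
small_scale = absolute_gap(relative_index, searches_per_unit=10)
large_scale = absolute_gap(relative_index, searches_per_unit=1_000_000)

print(relative_ratio(relative_index))  # identical under both scales
print(small_scale, large_scale)        # absolute gaps differ enormously
```

The picture looks the same in both cases; only the hidden scale tells you whether the gap is a handful of searches or millions.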

Also, what sample size was used? How were the samples selected? This gets even trickier because Google then categorizes the results by cities, regions and languages. Do they have enough samples to make meaningful statements at that level of detail? Similarly, on the time scale, what kind of smoothing was employed?

The special flags, while a wonderful concept, fall flat in practice, highlighting the limitations of machine intelligence. On the right, I copied the headlines for the flags. You may also be bewildered by the choice: not one of them has anything to do with comparing NYT and blogs.

Such half-baked tools are very dangerous indeed, as demonstrated by Andrew's comment. Andrew is one of the pioneers of news blogging who defected from mainstream media, so his bias is well known. Using this chart, he proclaimed: "They're [NYT] doomed."

Not so fast. It is unfair to "spread their votes" by using "new york times", "nytimes" and "ny times" as three separate entries. Besides, NYT is only one publication; pitting it against a world of blogs is absurd. Especially when the top 8 regions searching for "blog" are outside North America! (see the light blue bars on the right)

Meanwhile, this bar chart is also impossible to interpret. By "normalization", one assumes they are removing the effect of the total number of searches, or else the US would always end up at the top. Normalization is forever a double-edged sword: if you are the marketer, even if you see Peru as having the highest % of searches using "blog", you can't conclude that Peru is the market you should go after, since you may worry about how widespread Internet/Google penetration is in Peru. By hiding the scale (again), Google Trends stubbornly remains just a toy.
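
A small sketch of that double-edged sword, assuming "normalization" means dividing keyword searches by a region's total searches (my guess; Google does not say). The two regions and all the counts are hypothetical.

```python
# Invented figures for two hypothetical regions; not real search data.
regions = {
    "small_market": {"blog_searches": 5_000, "total_searches": 100_000},
    "large_market": {"blog_searches": 2_000_000, "total_searches": 200_000_000},
}


def normalized_share(stats):
    """Share of a region's searches that contain the keyword
    (an assumed definition of 'normalization')."""
    return stats["blog_searches"] / stats["total_searches"]


def rank_by(regions, key_func):
    """Return region names sorted from highest to lowest by key_func."""
    return sorted(regions, key=lambda name: key_func(regions[name]), reverse=True)


by_share = rank_by(regions, normalized_share)
by_volume = rank_by(regions, lambda stats: stats["blog_searches"])

print(by_share)   # ranking by normalized share
print(by_volume)  # ranking by absolute volume
```

The two rankings come out reversed: the small market wins on share, the large one on volume. A marketer needs both numbers, and the chart offers only the first.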



Comments

Hagrin

I usually don't post to disagree with an article, but I really don't see the merit of this article. Just because a user doesn't understand the tool and what its intended purpose is doesn't make the tool "dangerous".

End-of-year Zeitgeist numbers were never for marketing purposes, but general "cool" numbers to show Google readers what the popular search terms were for the year. Google Trends just makes this information available at any time, and lets you customize your search parameters to see what's more popular on a relative scale. Without the y-axis being labeled, that's all Google Trends is - a relative analyzer.

If you use a hammer to cut a piece of wood, you deserve the resulting splintered wood.

Sala

I generally agree with what you say. Comparing the word "blog" with "New York Times" doesn't seem to make sense.

What I find interesting, however, is the comparison of the search and the news volume. In this graph, it is obvious that the word "blog" has a higher impact on web searches, and the word "New York Times" has a higher impact on the news. But look at the graph I posted a couple of days ago, and I think Google Trends has something interesting to tell about how the media works.

Andrew

I don't see Google Trends providing any data that will have any "real" use. There are too many unknown parameters and the results are too easy to manipulate.

I like to perform Google Trends searches on different topics just for fun and post the results at my blog http://whatgoogletrendstaughtmetoday.blogspot.com/.

I definitely do not take the data/results seriously.

Peru-Andrew

I wouldn't say dangerous is the word, but anyway... In fact, blogging in Peru has become quite popular. There are some very odd things about Peru, like, for example, having the highest concentration of Internet cafes in the world. Quite probably, it also has the highest ratio of bloggers per 1,000 inhabitants!
Interesting stuff but not dangerous at all.
Cheers.

João Acabado

So if we had concluded from Google Trends that Peru had the highest ratio of bloggers per capita, we would've been right.

Dangerous ... who fears the error here?
