We all know about the game of telephone. If one person says something to another, and the second person to a third, and so on, the message received at the end of the chain will be substantively different. What gets me thinking about this today is an article by Gelman and friends, which essentially describes the game of telephone as it applies to data messaging: "Through endless repetition, numbers of dubious origin take on the veneer of scientific fact, in many cases in the context of vital public-policy debates."
I love their leading example. Apparently, at least one university in the U.S. circulated a public-service notice telling people to wear a hat on cold days because "most body heat is lost through the top of the head." The authors attempted to trace the origin of this message, and discovered that the advice purportedly came from a statistic, which claims that 50 or even 80 percent of body heat is lost through the (apparently cap-less) head.
[Most of the article is about tracing statistics on "corruption", and about the concern that politicians mislead the public using junk statistics.]
***
This little scenario reveals a number of things about our relationship with data.
First, repeating data seems to follow what Kahneman calls "fast thinking", not "slow thinking". People tend not to slow down, and apply logic, and ask: could we really be losing 80 percent of body heat through our head? Could a human being survive after losing 80 percent of its body heat? (Apparently, the more correct answer is 10 percent.)
Second, because of the fad built upon "unconventional wisdom," spread by the likes of Malcolm Gladwell and Steven Levitt, people may believe illogical data -- the more bizarre, the better. Shock value is the currency of social media, so this fad lives.
Third, I'm interested not just in the likelihood that the message has changed through repetition, but in what direction the message shifts. My working assumption is that the default setting of most people is story-first (as opposed to data-first). Data are typically consumed to support some hypothesis, i.e. a preconceived notion. In the game of telephone, the message is continuously shaped by the story-first mentality of the messenger.
This idea is an expanded version of what the authors called "decorative statistics", i.e. making numbers "sound big and impressive".
Each person receives a message but retains it only partially, because of memory failure. Each person also has their own beliefs, such as about the effect of not wearing hats on cold days. The outgoing message will tend to be pulled toward those prior beliefs.
When I find some time, I should run a simulation using Bayes' Rule to propagate messages at each stage and explore the above idea more rigorously.
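Here is a minimal sketch of what such a simulation might look like. It is one possible setup, not a finished analysis. It assumes every messenger holds the same Gaussian prior about the statistic, that memory failure adds Gaussian noise to whatever was heard, and that each person passes on the posterior mean from a normal-normal Bayes update. The function names and all the numbers (a starting message of 10 percent, a prior centered at 50 percent, the two standard deviations) are illustrative assumptions.

```python
import random


def relay(message, prior_mean, prior_sd, memory_sd, rng):
    """One link in the chain: hear a number, misremember it, blend it with a prior."""
    # Memory failure (assumed Gaussian): the recalled value is the message plus noise.
    recalled = message + rng.gauss(0.0, memory_sd)
    # Normal-normal Bayes update: the posterior mean is a precision-weighted
    # average of the prior belief and the (noisy) recalled value.
    w_prior = 1.0 / prior_sd ** 2
    w_data = 1.0 / memory_sd ** 2
    return (w_prior * prior_mean + w_data * recalled) / (w_prior + w_data)


def play_telephone(original, n_people, prior_mean, prior_sd, memory_sd, seed=0):
    """Pass a statistic down a chain of n_people and return every version of it."""
    rng = random.Random(seed)
    history = [original]
    for _ in range(n_people):
        history.append(relay(history[-1], prior_mean, prior_sd, memory_sd, rng))
    return history


# Illustrative run: the message starts at 10 percent, while every messenger
# believes (a priori) that the figure is around 50 percent.
chain = play_telephone(original=10.0, n_people=10,
                       prior_mean=50.0, prior_sd=20.0, memory_sd=10.0)
for step, value in enumerate(chain):
    print(f"after {step} retellings: {value:.1f} percent")
```

Under these assumptions, each retelling moves the number a fixed fraction of the way toward the shared prior, so the direction of drift is set by the story and not by the data. A fuller version would give each person a different prior and let the noise depend on how surprising the number is.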
***
Another factor enables this game of telephone in data stories. Every statistic ever reported has nuances - it is only true for specific populations, after certain exclusions, accounting for particular biases, etc. However, in messaging data, we don't have the luxury of attaching two paragraphs of explanations for each number. Therefore, the nuances are lost.
The orphaned data are out in the open. This invites anyone to adopt these statistics and shape them to their own liking. Fill in the blanks. If one is completely rigorous, one plugs the gaps by researching the origins of the data, but who has the time, right? So, instead, one just adds whatever one thinks "makes sense"... this is where we end up in a story-first mode, because making sense is in the context of that preconceived story. If we want people to wear hats, we like the story that hats keep heat contained in the body, and if we can't quite remember the statistic, we are susceptible to guesstimating a number that supports our story.