Felix Salmon laments that the standard of statistical literacy among journalists is appalling. He brings us this example about people who download music illegally:
About 95 percent of music downloads in 2010 were unlicensed and illegal, with no money flowing back to artists, songwriters or record producers, according to Alex Jacob, a spokesman for the International Federation of the Phonographic Industry. So riches could await a company that persuades some of these Internet scofflaws to change their ways.
He argues, rightly, that the source of the statistic (an organization fighting music piracy) makes it hard to take seriously. He also feels that the number fails the "sniff test". Could it really be true that only 5% of the music is paid for? If the music market were, say, $10 billion today, then they are suggesting that the real market could be as much as 20 times that number, or $200 billion.
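To make the sniff test concrete, here is the arithmetic as a few lines of Python. The $10 billion figure is the hypothetical from the paragraph above, not a measured number, and the function name is my own. The calculation also assumes that every illegal download would convert into a sale at current prices, which is exactly the assumption being questioned.

```python
def implied_total_market(paid_market, illegal_share):
    """Market size if every download, legal or not, were paid for at current prices.

    Assumes (unrealistically) that paid revenue scales one-for-one with downloads.
    """
    paid_share = 1.0 - illegal_share
    return paid_market / paid_share


paid_market = 10e9  # hypothetical: a $10 billion paid market
total = implied_total_market(paid_market, illegal_share=0.95)
multiple = total / paid_market

print(f"Implied market: ${total / 1e9:.0f} billion ({multiple:.0f} times today's)")
```

If only 5 percent of downloads are paid for, the paid market is one-twentieth of the whole, hence the factor of 20.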
I left a comment on his blog to point out the craziness of the last sentence in the quote above: the delusion that if BitTorrent and other illegal download methods suddenly vanished, online music revenues would jump 5-fold, 10-fold, etc. overnight. This is the same delusion that makes politicians/economists claim that we can solve our unemployment problem by giving our workforce more college degrees (in effect, shifting people from one bucket to a different bucket). I debunked that claim here.
The other point Felix made is well worth repeating: 95 percent of downloads is not the same as 95 percent of the people who download, because a small number of people account for an outsized proportion of total downloads. Felix didn't state this directly (he assumed it in an example), but it is most likely true that illegal downloaders download, on average, many more songs than legal downloaders, since price is no barrier to the former group.
What this means is that if 95 percent of downloads were illegal, then the proportion of people who are illegal downloaders is likely to be considerably lower than 95 percent.
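A small sketch shows how quickly the gap opens up. Take the two-bucket view at face value for a moment: a fraction p of people are illegal downloaders, and each of them downloads r times as many songs, on average, as a legal downloader. Then the illegal share of downloads is s = p*r / (p*r + (1 - p)), which can be solved for p. The ratio r is my own illustrative parameter; I have no data on its actual value.

```python
def illegal_people_share(download_share, volume_ratio):
    """Fraction of people who are illegal downloaders.

    download_share: fraction of all downloads that are illegal (s).
    volume_ratio:   average downloads per illegal downloader divided by
                    average downloads per legal downloader (r, assumed).
    Solves s = p*r / (p*r + (1 - p)) for p.
    """
    s, r = download_share, volume_ratio
    return s / (s + r * (1.0 - s))


for r in (1, 5, 10, 19, 50):  # illustrative ratios only
    p = illegal_people_share(0.95, r)
    print(f"ratio {r:>2}: {p:.1%} of people are illegal downloaders")
```

Only when both groups download the same amount (a ratio of 1) do the two percentages coincide. At a ratio of 19, half the people account for 95 percent of the downloads.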
One final point: it is also foolhardy to bluntly divide the world into Illegals and Legals. In statistics, we like to think there is a continuum with most people having done at least one illegal download, while perhaps most so-called Illegals have paid at least once for music downloads. So, if we want to do a proper analysis of this phenomenon, we should put a probability of downloading illegally on each individual, rather than assuming that each person is either an Illegal or a Legal, and not both.
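Here is a minimal simulation of what putting a probability on each individual might look like. Everything in it is made up for illustration: the Beta(0.5, 0.5) distribution of per-person probabilities, the range of downloads per person, and the function name. It is not fitted to any data; it only shows the kind of quantities such a model lets you compute.

```python
import random


def simulate(n_people=10_000, seed=0):
    """Give each person their own probability of downloading illegally."""
    rng = random.Random(seed)
    illegal_total = 0
    download_total = 0
    ever_illegal = 0
    ever_paid = 0
    for _ in range(n_people):
        p = rng.betavariate(0.5, 0.5)  # assumed: per-person probability of an illegal download
        n = rng.randint(1, 50)         # assumed: number of downloads by this person
        illegal = sum(rng.random() < p for _ in range(n))
        illegal_total += illegal
        download_total += n
        ever_illegal += int(illegal > 0)  # at least one illegal download
        ever_paid += int(illegal < n)     # at least one paid download
    return {
        "illegal_share_of_downloads": illegal_total / download_total,
        "people_with_any_illegal": ever_illegal / n_people,
        "people_with_any_paid": ever_paid / n_people,
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

In a model like this, the same person can show up among those who have downloaded illegally and among those who have paid, which the Illegal-or-Legal split rules out by construction.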
Yes, this makes data analysis sound complicated. It's easy to fall into the many traps. But there are only a small number of fundamental concepts, and once you understand those, you'll find them popping up everywhere.