As engineering undergrads, we had to take prerequisites in mathematics, physics, and chemistry. We were expected to possess a base level of scientific knowledge. It's like studying the classics for English majors. I'm surprised that it is not routine for social science majors to take a Census 101 course (tell me I'm wrong about this). Everyone should have a base-level understanding of U.S. demography. Stuff like this:
These are population pyramids that show the distribution of population by age group and gender simultaneously. In each pyramid, the red and blue bars together sum up to 100%. The small-multiples format (see Junk Charts posts) brings out the comparison between racial groups: the top left chart shows whites, which essentially is the aggregate U.S. pyramid because the minority racial groups are still not large enough to matter (whites are about 70%, blacks and Hispanics each 12%, Asians 4%).
The key learning is that the minority racial groups all tend to skew young. In particular, focusing on the top row (whites v. blacks), we learn that about 34% of blacks are under 18 compared to 24% of whites, while about 4% more whites than blacks belong to each of the 55-64 and 65+ age groups. (Note that, annoyingly, the scales chosen for the four pyramids are not the same. Despite its appearance, the Asian pyramid is quite similar to the one for whites.)
What prompted this post is the press release by Nielsen titled "African-Americans, Women, and Southerners Talk and Text the Most in the U.S.". It has been all over the blogosphere (thanks to my friend Augustine F. for alerting me to it). For example, Gizmodo republished this as "Black people call and text way more than everyone else", with this chart shown on the right. Curiously, the Nielsen article contains no graphics on either race or gender despite the prominence of these two factors in the headline.
The implication of such an analysis is that race is the main factor causing people to call and text more. This sort of thing I have called "true lies". The statistic is true but it creates a false impression.
Usefully, the Nielsen article contains a chart on differential usage by age group, which I reproduce here. Notice that the variance across age groups is much more pronounced than the variance across racial groups.
Teenagers text more than twice as much as 18-24-year-olds, who text more than twice as much as 25-34-year-olds, who text more than twice as much as 45-54-year-olds, who text three times as much as 55-64-year-olds, etc. In terms of voice minutes, 18-44-year-olds call 900 minutes or more compared to under 600 minutes for those 55 and over.
If the analyst has taken Census 101, he or she might put two and two together and realize that minority racial groups skew young. So the question is whether blacks call and text more because they are black or because they are younger.
At this point, the analyst should refer back to the raw data, looking at the numbers at the race-age level. Not having the data, we make do with an estimate. By inspecting the pyramids, we compute that blacks have about 10% more under-18-year-olds than whites, while whites have about 2% more 45-54-year-olds, 4% more 55-64-year-olds and 4% more people aged 65+. So based on this age skew alone, we expect blacks to have an average of 270 more text messages! (Just multiply the excess proportion in each age group by the average text messages in that age group, and add up the results.) The actual difference according to Gizmodo between blacks and whites was only 114 messages. So their headline of blacks texting more is misleading... the story is that blacks are disproportionately younger and youngsters text a lot more.
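The back-of-envelope estimate above can be sketched in a few lines of Python. The excess proportions are the ones read off the pyramids in the text. The per-age-group text averages are round placeholder values that I made up to be of a plausible magnitude; they are not read from the Nielsen chart, so the printed number is illustrative only. The function name is also just something I picked for this sketch.

```python
def expected_gap_from_age_skew(excess_share, avg_by_age):
    """Gap in the overall average that age mix alone would produce.

    excess_share: share of blacks minus share of whites in each age group
    avg_by_age:   average usage in each age group (assumed equal for both races)
    """
    return sum(excess_share[age] * avg_by_age[age] for age in excess_share)


# Excess proportions from the pyramids (blacks minus whites), as stated in the post.
# Age groups with no visible difference contribute nothing and are left out.
excess_share = {"under 18": 0.10, "45-54": -0.02, "55-64": -0.04, "65+": -0.04}

# HYPOTHETICAL monthly text averages per age group. Replace these with the
# values from the Nielsen age chart before drawing any conclusion.
avg_texts = {"under 18": 2800, "45-54": 230, "55-64": 80, "65+": 30}

gap = expected_gap_from_age_skew(excess_share, avg_texts)
print(f"Expected black-white gap from age skew alone: {gap:.0f} texts")
```

The point of the exercise is the comparison: if the gap predicted from age mix alone is as large as, or larger than, the observed gap, then race adds nothing to the explanation.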
The story of voice minutes is different from that of text messages, which is the opposite of what Gizmodo claimed. For voice minutes, the average for blacks (of all age groups) is almost double the white average. Referring back to the age groups for which the proportions of blacks are different from whites, the blue bars in the Nielsen chart tell us that the average minutes in those bars are not that different (631 for under 18, 587 for 55-64, etc.), and thus age-group variation cannot explain the large gap in voice minutes.
It is still premature to claim blacks call a lot more than whites because of their race. One should check other factors like income groups, education levels, etc. to make sure that something like the text message story is not repeating itself.
If you have the raw data, what you'd do is look at black-white differences within each age group separately; if blacks do text a lot more than whites in most or all age groups, then that's evidence for the Gizmodo hypothesis. Of course, given the quick estimate above, we already know that is not the case with texting. For those who read Numbers Rule Your World, this is the key statistical concept of Chapter 3.
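Here is a minimal sketch of that age-by-age comparison, assuming the raw data came as one record per person. The records below are entirely made up (I don't have the Nielsen data), and the function names are my own. The toy data is rigged so that the two groups text identically within each age group, yet the overall averages differ because one group is younger.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL records: (race, age_group, monthly_texts). Not Nielsen data.
records = [
    ("black", "under 18", 3000), ("black", "under 18", 3000),
    ("black", "under 18", 3000), ("black", "55-64", 100),
    ("white", "under 18", 3000), ("white", "55-64", 100),
    ("white", "55-64", 100), ("white", "55-64", 100),
]


def overall_gap(records, group_a="black", group_b="white"):
    """Difference in overall averages, ignoring age entirely."""
    a = [texts for race, _, texts in records if race == group_a]
    b = [texts for race, _, texts in records if race == group_b]
    return mean(a) - mean(b)


def gap_by_age_group(records, group_a="black", group_b="white"):
    """Difference in averages computed separately within each age group."""
    values = defaultdict(lambda: defaultdict(list))
    for race, age, texts in records:
        values[age][race].append(texts)
    gaps = {}
    for age, by_race in values.items():
        # Only compare age groups where both races are represented.
        if by_race[group_a] and by_race[group_b]:
            gaps[age] = mean(by_race[group_a]) - mean(by_race[group_b])
    return gaps


print("Overall gap:", overall_gap(records))
print("Gap within each age group:", gap_by_age_group(records))
```

In this toy example the overall gap is large while the gap inside every age group is zero, which is exactly the pattern that would tell you age, not race, is doing the work.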
For comments on the graphic used in the press release, see the related entry on Junk Charts.
The population pyramids came from the report called "Youth Demographics" by Mark Hugo Lopez and Karlo Barrios Marcelo, published November 2006.
PS. For those paying attention, the reason why the quick estimate is not a perfect substitute for getting the raw data is that I have to assume there is no "interaction effect" between age and race. In other words, I assume that the average within an age group is the same for blacks as for whites. In reality, it is possible (though unlikely) that white teenagers are more like white adults than they are like black teenagers.