This graphic, unfortunately chosen, contains many distractions from the main message, including:
- the liberal sprinkling of colors
- the inclusion of separate data for each of 1, 2, 3, 4, 5 and 6 days, almost all of which are effectively zero
- the redundant vertical scale, since every data value is already printed on the chart itself
- the comparison of smokers to "total sample" (rather than non-smokers)
The last point merits special attention. The total sample contains households with smokers as well as households without smokers. Any statistic from the total sample is therefore a weighted average of the corresponding statistics for these two types of households. It is better to compare the two household types directly than to compare one type indirectly to the overall sample.
Further, households without smokers should be extremely likely to have no smoking in residence all week. And since most households have no smokers (76% of this sample), the statistics of the total sample will closely mimic those of no-smoker households. That is to say, the total sample statistics do not add much to the analysis. Our junkart version below corrects for this as well as the other issues.
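To make the weighted-average argument concrete, here is a minimal Python sketch. Only the 76% share of no-smoker households comes from the survey; the two proportions for "no smoking in residence all week" are made-up numbers chosen purely for illustration, and the function name `total_sample_rate` is my own.

```python
# Only this share is taken from the survey text (76% of households have no smokers).
share_no_smoker = 0.76

# ASSUMED, made-up proportions of households with no smoking in residence all week.
p_none_no_smoker = 0.95  # hypothetical value for no-smoker households
p_none_smoker = 0.40     # hypothetical value for smoker households


def total_sample_rate(p_no_smoker_hh, p_smoker_hh, w_no_smoker=share_no_smoker):
    """Total-sample statistic as a weighted average of the two household types."""
    return w_no_smoker * p_no_smoker_hh + (1 - w_no_smoker) * p_smoker_hh


p_none_total = total_sample_rate(p_none_no_smoker, p_none_smoker)

# Direct comparison: no-smoker households vs. smoker households.
gap_direct = p_none_no_smoker - p_none_smoker

# Indirect comparison: total sample vs. smoker households.
gap_vs_total = p_none_total - p_none_smoker

print(f"total sample: {p_none_total:.3f}")
print(f"gap, no-smoker vs smoker: {gap_direct:.3f}")
print(f"gap, total vs smoker:     {gap_vs_total:.3f}")
```

Whatever numbers are plugged in, the gap between the total sample and smoker households equals the direct gap multiplied by the no-smoker share (here 0.76), so the indirect comparison always understates the difference between the two household types.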
One of the key functions of a graph is data reduction, i.e. to aggregate data in such a way as to expose the information contained within. Typically, a graph that uses aggregated data is clearer and stronger than one that plots every piece of data. In this example, by combining 1-6 days into a single category ("smokes in residence part of the week"), we have a graph that is much more readable.
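The aggregation step described above can be sketched in a few lines of Python. The percentages below are hypothetical placeholders, not the survey's figures, and the labels other than "smokes in residence part of the week" are my own wording.

```python
def collapse_days(days):
    """Map a number of days (0-7) smoked in residence to one of three categories."""
    if not 0 <= days <= 7:
        raise ValueError("days must be between 0 and 7")
    if days == 0:
        return "no smoking in residence all week"
    if days == 7:
        return "smokes in residence all week"
    return "smokes in residence part of the week"


# HYPOTHETICAL percentages of households by number of days (not survey data).
pct_by_days = {0: 60.0, 1: 1.0, 2: 1.5, 3: 0.5, 4: 0.5, 5: 1.0, 6: 0.5, 7: 35.0}

# Sum the percentages within each collapsed category.
collapsed = {}
for days, pct in pct_by_days.items():
    label = collapse_days(days)
    collapsed[label] = collapsed.get(label, 0.0) + pct

for label, pct in collapsed.items():
    print(f"{label}: {pct:.1f}%")
```

Eight bars, six of them near zero, become three categories that can be read at a glance.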
I want to thank Dr. Mike Rabinoff for inspiring me to look up these second-hand smoking statistics. Mike recently published a book called "Ending the Tobacco Holocaust", which tells you more than you want to know about the tobacco industry.
Reference: "Second Hand Smoke Survey: Final Report", Madison Department of Public Health, Dec 2003.