The New York Times covers several software companies whose business is analyzing databases of documents and emails in support of litigation. Large corporations are often sued, and they must submit troves of documents and emails to lawyers who charge hourly rates to inspect and summarize the information; the bills can run to millions of dollars in large lawsuits. These technology upstarts claim to be able to replace the humans with computers.
I recommend reading this article... with a critical eye. Don't expect help from the journalist (John Markoff), who seems blind to, or incapable of, evaluating the limitations of the technology. The tone and content of this article are typical of anything in the technology pages of our news media: one senses awe and unbounded optimism, as if the articles were press releases issued by the companies making the technologies.
Here are a few points to ponder:
I am not saying these companies are hawking vaporware. For some tasks, computers clearly can do a much better job than humans. I just think that our technology reporters can serve us better by covering both the promise and the limitations of new tools. They can start by interviewing users of such software, both satisfied and unhappy ones.
Having picked up and breezed through David Moore's book The Opinion Makers, I am reminded of the promised second installment of my "close reading" of Charles Seife's Proofiness. Previously, I covered the first part of the book; the second part applies the first part's lessons to political issues. (My summary review of Proofiness is here.)
Moore is a former Gallup pollster; perhaps he would consider himself a "reformed" pollster. He shares Seife's view of political polling: that polling has been usurped by the political apparatus and abused by the mass media; that polls do not reflect the so-called "public opinion"; that politicians are foolhardy to rule by poll results; and that journalists (and also politicians) create polls to validate their agendas or preconceived stories. In short, they think polls usurp democracy, an ironic conclusion given that their inventors had promised the opposite.
Moore's book is short (about 150 pages) and feels even shorter because he has only one main thesis, but what a powerful thesis it is: all modern poll results are misleading because they include the views of large numbers of people who have no knowledge of, interest in, or conviction about the topic at hand. When a category of "don't know" is reported, it is almost always a criminal underestimate of the real proportion of people who don't know or don't care. In poll after poll, the real insight is that most Americans do not follow major political events and thus do not have the knowledge to hold a stable opinion on much of anything. But the pollsters don't tell you that.
This trickery is accomplished in two ways. First, the so-called "forced choice" design poses every question as a yes/no choice, and a "don't know" is recorded only if the respondent refuses to pick one of the two options. In the rare poll in which an explicit "don't know" is offered as a choice, the number selecting it is many times higher.
Second, to forestall "don't know" answers, pollsters often feed information to the respondents, such as "This morning, the Egyptian strongman resigned. What do you think...?" People who are not aware of this development become aware of it in the course of answering the poll! As Moore rightfully asserts, "Once respondents have been fed any information, they no longer represent the general public" (p.146).
One measure of the failure of polling is that respondents are found to have very little conviction behind their expressed opinions. When asked whether they would be "upset" if the policy they oppose were implemented, a surprisingly large number of respondents claim they don't care.
For the statistically minded, I'd suggest jumping from Chapter 1 directly to Chapters 7-8, where Moore discusses some statistical topics. Chapter 7 contains material on low and declining response rates (80% in the 1960s to below 25% today), and on the unresolved question of whether nonresponders are actually different from responders. Chapter 8 covers a variety of poll design issues, such as how the wording and order of questions affect responses dramatically, and the rise of cell phones and the Internet. Seife, in his Chapter 4, has a nice treatment of the wording issue.
The other chapters of The Opinion Makers detail a sequence of examples illustrating Moore's main thesis. They form a kind of chronology of major polling efforts in the recent past, including the controversies.
Both Seife and Moore look at pre-election polls that are used to "call" elections. They accept the "conventional wisdom" among pollsters that past errors are primarily due to "poor timing": voters change their minds between the time they are polled and the time they cast their ballots. This is certainly plausible, but in my view, they could have taken this line of thinking to its logical conclusion: that the average person doesn't take our politics seriously enough to hold any real opinions; that such positions as people have are subject to change; and thus that pre-election polls are a waste of time, possibly excepting the ones conducted within days of the election. On exit polls, Seife does ask this question: whether our lives would be diminished if exit polls didn't exist.
Both Seife and Moore lament that polls have become a "journalistic invention" used to create "pseudo-events" which allow journalists to write what they would have written without the polls. Moore believes the watershed was when the major polling companies started pairing up with national media outlets, losing their independent status.
Both Seife (Chapter 4) and Moore (Chapter 3) recount the lesson of the Literary Digest straw poll, which failed spectacularly in 1936 by predicting that Alf Landon would beat Roosevelt in a landslide. The lesson is that a large sample is necessary but not sufficient for an accurate prediction: because Literary Digest drew its sample from automobile and telephone lists (neither good was as ubiquitous then as it is now), the sample was biased against lower-income people, who were more likely to vote Democratic.
Seife took this as an opportunity to explain two types of polling errors, technically known as bias and variance. Variance is the variability of results from sample to sample, and is captured by the margin of error of the poll. Bias is a systematic error caused by selecting a sample that does not adequately represent the population. This is really a lovely real-life example illustrating the concepts.
The narrative then took a wrong turn when Seife stated: "[Literary Digest] thought their sample size made their poll accurate to within a tiny fraction of a percent, when in fact it couldn’t be trusted within ten or fifteen points." (p.107)
Unfortunately, bias is not measured in plus/minus points. A good analogy is archery. Archer V(ariance) sprays his arrows around the bull's-eye, frequently missing the target, but on average his aim is at the bull's-eye. Archer B(ias) hits the same spot most of the time, but because of a slight tilt in his motion, his arrows consistently land away from the bull's-eye. We can describe V's error using a circle centered at the bull's-eye, but to describe B's error, we use the distance between the bull's-eye and the spot that B typically hits.
In the case of the straw poll, that circle is negligibly small because of the enormous sample size; the bias should be measured as the amount by which the poll overstated the percentage voting for Landon. It is an error in one direction.
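To make the archer analogy concrete, here is a minimal simulation of my own. The numbers are invented (Landon in fact won about 38% of the vote; I assume the Digest's frame of car and phone owners leaned 55% Landon), but the mechanics are general: a huge sample shrinks the margin of error yet does nothing about a skewed frame.

```python
import random

# A sketch with invented numbers, loosely based on 1936: Landon's true
# share was about 38%, while the mail-in frame is assumed to lean 55%.
random.seed(1936)
TRUE_LANDON = 0.38
FRAME_LANDON = 0.55

def poll(n, landon_share):
    """Poll n voters drawn from a frame with the given Landon share."""
    return sum(random.random() < landon_share for _ in range(n)) / n

print(f"small unbiased poll (n=1,000):  {poll(1_000, TRUE_LANDON):.3f}")
print(f"huge biased poll (n=2,400,000): {poll(2_400_000, FRAME_LANDON):.3f}")

# The huge poll's margin of error, 1.96 * sqrt(p(1-p)/n), is a tiny
# fraction of a percent -- it is precise about the wrong number. The
# bias (0.55 - 0.38 = 17 points) does not shrink with sample size.
p, n = FRAME_LANDON, 2_400_000
moe = 1.96 * (p * (1 - p) / n) ** 0.5
print(f"margin of error of the huge poll: +/-{100 * moe:.2f} points")
```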
Seife’s coverage is broader than polls. He argues that our politics has been corrupted by bad statistics in a variety of other areas, including vote counts, voting schemes, gerrymandering, census politics, ignorance of statistics among Supreme Court justices, and government propaganda (Chapters 4 to 8). Both Seife and Moore stress that both political parties are guilty, and they give ample examples of the parties' wayward behavior.
The second part of Proofiness covers materials that are less trodden by others, and is well worth the read.
A rich open question in polling is whether nonresponse constitutes bias. Moore seems undecided on this point as he cites nonresponse as a source of bias when explaining the spectacular failure of the Literary Digest 1936 straw poll while also describing research by Pew claiming that nonresponders are not meaningfully different from responders. This subject is, unfortunately, almost impossible to study because by definition, nonresponders don’t want to talk to us.
I'm continuing my close reading series with Charles Seife's Proofiness: The Dark Arts of Mathematical Deception. The previous series, on the first three chapters of SuperFreakonomics, can be found here, here, here and here. My overall review of Seife's book was posted earlier.
You should think of these pieces as reading guides. I record my thoughts as I make my way through the chapters. If I didn't find the books worth reading, I would not have spent the time to read them with a magnifying glass. I am a stickler for precise language, even in popular science books. I understand the need to entertain, but just as I've been saying at Junk Charts for years, entertainment should not get in the way of clarity. That is a tall order, and sometimes, despite valiant efforts, entertaining prose is not as clear as it should be. That's where I come in.
As discussed in my review, the first three chapters of Seife's book are an update to the "damned lies and statistics" genre made famous by Huff's How to Lie with Statistics. Seife is a professor of journalism, and I can only hope journalists will read his book and learn from it. Statisticians will be mostly happy with what Seife says here.
Book jacket and p.4: Seife's definition of "proofiness" is "the art of using bogus mathematical arguments to prove something that you know in your heart is true -- even when it's not." Something about this sentence confuses me a lot. Who is the "you"? Is he referring to the politicians, journalists, scientists, etc. whom he will later skewer for committing proofiness, or to readers like us who are interpreting the numbers fed to us? Or is he saying proofiness is a condition that affects us all, the sleazy politicians as well as you and me? Is he talking about self-deception or about deceiving others? Is the math "bogus" because it is incorrect or because it is fake? I'd like a clearer sense of the book's headline concept, but at this stage, I keep my curiosity in check.
p. 7: Seife clears the air with the first sentence of Chapter 1: "If you want to get people to believe something really, really stupid, just stick a number on it." Now, I'm not sure this captures everything he's planning to write about, but for me, this sentence works as a definition of proofiness.
He doesn't make a distinction between journalists who don't fact-check the numbers they cite, scientists who publish false-positive results, and politicians who invent numbers to push their agendas.
p. 9: "Two plus two is always four. It was always so, long before our species walked the earth, and it will be so long after the end of civilization.": This is just wrong. Arithmetic is a human invention. Besides, 2+2=4 only works in base 10 so it's not even always true today. (Read about systems of numbers here.)
p.10: Seife raises a very important point: that all measurements are inexact and come with margins of error. This leads to the following observation:
To mathematicians, numbers represent indisputable truths; to the rest of us, they come from inherently impure, imperfect measurements.
I'm left wondering: on which side should statisticians put themselves? We identify with mathematicians, but we also regard numbers as uncertain measurements.
p.17: He skewers marketers for making up crowd/listenership/circulation figures. Useful to think about the incentives behind this sort of bogus statistics. People who buy advertising are looking for "reach"; how many people will be exposed to the commercial is pretty much the only metric they have to put a value on the advertising campaign. Both the people who spend the money and the people who take the money have an incentive to claim the campaign has been a success! Also, think about the empty restaurant syndrome: many of us would turn away from a restaurant if we find it empty; there is a sort of self-fulfilling prophecy that crowds engender crowds. I'm not arguing for the morality of this; I hope he will address these behavioral issues eventually in the book.
p.17: In a section on estimates of the size of "tea party" crowds, Seife cites ABC, Sean Hannity, and the Washington Post; the Post's number is his pick. This anecdote raises the all-important question of who should judge whether a statistic is bogus. (In a different context: which poll is credible?) On the previous page, Seife reports that the park service has taken itself out of the business of issuing "official" crowd sizes at public events. Not surprising at all, because such a job is completely thankless: whatever number you provide, you will make enemies!
p.21: Starting to feel like I'm in a remedial class; he's still gnawing at the same point about numbers being approximations... but I soldier on.
p.21-2: Seife makes an excellent point about precise numbers giving a false sense of certainty. However, I don't like the two examples he uses to explain it. The first example is someone who reports that his car costs $15,000 versus someone who says $15,323; the second is estimating one's own age as 18 years as opposed to 18 years, 2 months and 3 days. Seife tells us we should believe the people who give us rounded numbers over those who give us precise numbers. This is a fine point, but unfortunately, in both of these examples, the people who give precise numbers are plausible; they are talking about their own cars and their own ages. If, on the other hand, they were asked to guess the cost of a neighbor's car, or the age of the neighbor's dog, then Seife's point would be well taken.
p.23: His next set of examples I also find wanting. He complains about Kofi Annan and the UN making a big fuss about pinpointing the Earth's 6-billionth inhabitant, and about the Chicago Sun-Times declaring the 300-millionth U.S. resident (obviously impossible to know, just as it is impossible to know who is the first new-year's baby). The problem: both are inconsequential PR stunts. There are plenty of examples in which ignoring error bars leads to bad decisions that have real consequences.
p.26: Seife likes to name things: proofiness, Potemkin numbers, disestimation, and now "fruit-packing". Fruit-packing is his umbrella term for cherry-picking, comparing apples and oranges, and apple-polishing. One can hardly complain about reiterating these, since the mistakes keep cropping up.
p.33: He gives examples of deception practiced by both the Bush people and the Gore people. Various techniques of deception are involved; delving deeper, I think the common spirit of all such tactics is misdirection: creating a general sense of suspicion by channeling attention to a trivial aspect of the data that has no significance to the conclusion. Because random chance is always part of statistics, any analysis can be trivialized in this manner. Don't fall for it.
p. 39-54: He's on the evergreen correlation-is-not-causation theme. Seife looks at the correlation (better described as a coincidence) between the advent of NutraSweet and a rise in brain tumors, and points out that the rise in tumors also coincides with a rise in deficit spending. It is a good illustration of why arguments based on "changepoints" (the timing of a change) are shaky at best.
Readers must be careful with this discussion! Realize that the existence of one spurious correlation does not prove that all correlations are spurious. In other words, while we may agree that deficit spending could not plausibly have caused brain tumors, that fact is not valid evidence against the NutraSweet-causes-brain-tumors theory.
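To see how easily coincident trends manufacture correlation, here is a little simulation of my own, not from the book: two series that drift upward for entirely unrelated reasons end up strongly correlated.

```python
import random

# Two invented series that each drift upward for unrelated reasons:
# a trend plus noise, with no causal connection between them.
random.seed(1)
n = 50
deficit = [t * 1.0 + random.gauss(0, 3) for t in range(n)]
tumors = [t * 0.5 + random.gauss(0, 2) for t in range(n)]

def corr(x, y):
    """Pearson correlation, computed from scratch."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

print(f"correlation: {corr(deficit, tumors):.2f}")
# Typically 0.9 or higher -- yet neither series causes the other.
```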
My readers will note that I disapprove of how most statistics textbooks discuss the correlation/causation issue: you are told what not to do, but left with no idea of what to do.
p.55: Seife describes Nature as the most prestigious science journal in the world but he proceeds in the rest of the chapter with a succession of examples of results, all published in Nature, that were later overturned on statistical grounds, which makes me wonder if its prestige is undeserved.
p.59: He turns to why extrapolation beyond the study population is not recommended. On safe ground here.
pp.79-80: Some comments about Enron and Madoff, no doubt to satiate the publishers. These are complex situations not easily covered in a few pages. Did people really have no clue about the risk of Enron? Or were they willing participants in a Ponzi scheme? Do individuals make their own investment decisions these days when most stocks are primarily owned by institutions (mutual funds, e.g.)? Do investment managers have an incentive to manage clients' risk exposure for the clients' benefit or for their own benefit?
p.81: I like how he explains the mortgage mess, but he strangely ignores the credit rating agencies.
p.86: The tragedy of the commons always makes good reading. I recently came across a variant of this phenomenon. A friend decided not to sell his house, which is "under water" because it was bought near the peak of the boom. He opted to hurt his own economic interest in order to benefit the commons, and to avoid the wrath of his neighbors: anyone who sells creates a "market price" for the entire neighborhood, and if no one sells, there is no "mark to market". Here, the commons is the fictional book value of the homes, and the tragedy would be for individual homeowners to exit, which has the effect of reducing the value of the commons. My friend's behavior shows that we cannot assume everyone acts solely in his own self-interest.
The game-theoretic framework is perfect for such problems. I believe these problems cannot be analyzed without introducing a "morality" dimension, and without recognizing that there is individual variability in "morality". It is almost certain, I believe, that someone in my friend's neighborhood will eventually sell, forcing all the others to mark to market. The first one to sell is likely to fetch the best price. The uncomfortable truth may be that being moral is idiotic, that good guys finish last.
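For the curious, here is a toy version of that game, with payoffs entirely made up by me, arranged as a prisoner's dilemma between two neighboring homeowners:

```python
# Hypothetical payoffs (in made-up units of home value) for a
# two-homeowner game in which each chooses to Sell or Hold.
payoffs = {
    ("hold", "hold"): (10, 10),  # the fiction of high book value survives
    ("sell", "hold"): (12, 3),   # first seller fetches the best price...
    ("hold", "sell"): (3, 12),   # ...while the holder is marked to market
    ("sell", "sell"): (5, 5),    # everyone marks to market at a low price
}

def best_reply(opponent_action):
    """Player A's best action given what the neighbor does."""
    return max(("hold", "sell"),
               key=lambda a: payoffs[(a, opponent_action)][0])

for b in ("hold", "sell"):
    print(f"if the neighbor {b}s, the best reply is to {best_reply(b)}")
# "sell" is best either way: with these payoffs, holding out, as my
# friend did, cannot be explained by self-interest alone.
```

With this payoff structure, selling is the dominant strategy, which is exactly why my friend's behavior calls for the "morality" dimension above.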
pp.87-90: He considers moral hazard as a form of "risk mismanagement". I agree it's a crucial topic that is being swept under the carpet by the political class, but it's an entirely different thing from incorrectly computing the odds of something, which he discussed earlier in the chapter. Seife's decision not to distinguish between deliberate deception and incompetent mistakes hurts his point here: the bankers who created the mortgage mess did not mismanage risk; on the contrary, they understood how the risk could be moved around and ultimately socialized.
In these pages, Seife also parrots the government's version of the bailout story. This creates an awkward juxtaposition of the supposedly unavoidable socializing of private losses and the clear warning about moral hazard. What is left unexamined is why only two extreme solutions were considered: socializing all losses and socializing none.
The overly excitable people come out of the woodwork again, following this morning's information-free report on weekly jobless claims. Here are two examples of the breathless commentary:
Business Insider: Weekly jobless claims of 435K are way better than expectations
CNN Money: Initial jobless claims tumble to 4-month low
By "way better", BI is talking about a difference between 435K and 450K. By "tumbling", CNN is talking about a drop of 24K or 5%.
Here is why the announcement this morning contains no useful "information," as in anything that allows us to conclude that the job market is any better.
The two journalists both note that the 4-week moving average (the average of the past four weekly figures) dropped by 10K; in other words, the smoothed series has already been declining at about that rate. So the current estimate (almost always revised later to show more losses) of a 24K drop last week is in the same ballpark as the previous four weeks. Note also that the weekly number would have to keep decreasing at this rate for the next few weeks before the moving average showed a decrease of 24K. So there is no news here at all.
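Here is a toy version of the arithmetic; the earlier weeks' figures are invented so as to match the reported changes (a 435K latest week, down 24K, with the 4-week average down 10K):

```python
# Invented weekly claims (thousands) chosen to match the reported deltas:
# latest week 435K, down 24K from 459K, with the 4-week average down 10K.
claims = [475, 460, 455, 459, 435]

def moving_average(series, window=4):
    """Average of the most recent `window` values in the series."""
    return sum(series[-window:]) / window

ma_before = moving_average(claims[:-1])
ma_after = moving_average(claims)
print(f"4-week MA: {ma_before:.2f}K -> {ma_after:.2f}K "
      f"(change: {ma_after - ma_before:+.2f}K)")
# The average moves by (newest - dropped)/4 = (435 - 475)/4 = -10K:
# a single week's "tumble" barely registers in the smoothed series.
```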
The graph below clearly shows that nothing remarkable is happening in the context of the recent history of this data series: we are bouncing around the plateau of about 450K losses per week, a level at which, as economists keep telling us, the economy cannot even absorb the influx of working-age people.
(Image from Calculated Risk).
In particular, focusing on the tail of this data series, we see that the 24K change is not "way better," nor is anything "tumbling". The reporters rob these words of their meaning by using them so loosely.
For those who want to explore further, the government's press release is here. You will find that the entire drop in job losses comes from the "seasonal adjustment". The raw numbers actually show an uptick of about 30K in job losses from the previous week; the adjustment subtracted about 54K. I am not saying seasonal adjustment is a bad thing, but it is important to differentiate between the "run rate" of job losses (the post-adjustment figure) and the reality of job losses (the raw count).
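The arithmetic, as a back-of-envelope check using the figures just cited:

```python
# A back-of-envelope check using the figures cited above (thousands).
raw_change = +30        # raw claims rose about 30K week over week
reported_change = -24   # the headline, seasonally adjusted, fell 24K
adjustment = reported_change - raw_change
print(f"the seasonal adjustment contributed {adjustment}K")
# -54K: the entire headline "drop" comes from the adjustment,
# not from the raw count of new claims.
```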
No sooner had I written about "story time" than the LA Times journalists on the education beat announced "Story time!"
An article published recently on using test scores to rate individual teachers has stirred the education community. It attracted Andrew Gelman's attention and there is a lively discussion on his blog, which is where I picked up the piece. (For discussion on the statistics, please go there and check out the comments.)
In reading such articles, we must look out for the moment(s) when the reporters announce story time. Much of the article is great propaganda for the statistics lobby, describing an attempt to use observational data to address a practical question, sort of a Freakonomics-style application.
We have no problem when they say things like: "There is a substantial gap at year's end between students whose teachers were in the top 10% in effectiveness and the bottom 10%. The fortunate students ranked 17 percentile points higher in English and 25 points higher in math."
Or this: "On average, Smith's students slide under his instruction, losing 14 percentile points in math during the school year relative to their peers districtwide, The Times found. Overall, he ranked among the least effective of the district's elementary school teachers."
Midway through the article (right before the section called "Study in contrasts"), we arrive at these two paragraphs (my italics):
On visits to the classrooms of more than 50 elementary school teachers in Los Angeles, Times reporters found that the most effective instructors differed widely in style and personality. Perhaps not surprisingly, they shared a tendency to be strict, maintain high standards and encourage critical thinking.
But the surest sign of a teacher's effectiveness was the engagement of his or her students — something that often was obvious from the expressions on their faces.
At the very moment they tell readers that engaging students makes teachers more effective, they announce "Story time!" With barely a fuss, they move from an evidence-based analysis of test scores to speculation about cause and effect. Their story is no more credible than anybody else's, unless they also provide data to support the causal link. Visiting classrooms and recording observations is no substitute for evidence of causation.
This type of reporting happens a lot; just open any business section. The articles all start with some fact (oil prices went up, Google stock went down, etc.), and then it's open mike for story time. None of the subsequent stories is supported by any data; the opening numbers create the impression that the author is using data, but they have nothing to do with the hypotheses that follow. So be careful!
Currently (Tuesday), the top story on the New York Times's website is the one about spinal taps as a predictor of Alzheimer's.
In short, the researchers are making claims of "perfection" (or near-perfection), that the presence of certain proteins in one's spinal fluid is certain proof that one will eventually develop Alzheimer's.
If you've read my book, especially Chapter 4 (and the associated stuff in the Conclusion), you should be able to think statistically about what is being printed on the page.
While I am quite sure that this finding is important (at least in stimulating further research), I don't think there is enough information for readers to be fully convinced. Every time someone trumpets "perfection", and particularly in forecasting and prediction, we ought to start from a position of skepticism. So in this spirit, here I go.
In this post, I focus my attention on how the numbers were reported in this article, and on the following sentences in particular (I've numbered them for convenience):
(1) The new study included more than 300 patients in their seventies, 114 with normal memories, 200 with memory problems, and 102 with Alzheimer's disease.
(2) Nearly every person with Alzheimer's had the characteristic spinal fluid protein levels. (3) Nearly three quarters of people with mild cognitive impairment, a memory impediment that can precede Alzheimer's, had Alzheimer's-like spinal fluid proteins. And every one of those patients developed Alzheimer's within five years. (4) And about a third of people with normal memories had spinal fluid indicating Alzheimer's. Researchers suspect that those people will develop memory problems.
In (1), the numbers don't add up (114 + 200 + 102 = 416, which is not "more than 300"), and I'm confused by the placement of the commas. Is it that the "more than 300" were composed of three subgroups? Or that there were four subgroups in the experiment, one of which consisted of people in their seventies? This inconsistency, in itself not deadly, could easily have been fixed, but it smacks of carelessness.
But what are the not-to-be-missed words in (1)? The qualifier in their seventies. Blink and you may miss it. This is of crucial importance because all of the study's subjects were elderly people, who are most at risk of developing Alzheimer's soon. Why does this matter?
Recall that in Chapter 4, I discussed trying to pick a criminal out of a police line-up of, say, 10 suspects, versus trying to pick a thief out of a large-scale screening of thousands of employees at a company. It turns out the latter is a much more difficult task (because the proportion of criminals is much lower in the screening population than in the line-up), and predictive accuracy for the screening task is definitely worse, all else being equal.
Applied here: trying to predict who will develop Alzheimer's among people in their seventies is much easier than predicting among people in, say, their forties. So we are a long way from full success.
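To put rough numbers on the line-up versus screening contrast, here is a small sketch; the 90% sensitivity and specificity are my illustrative assumptions, not figures from this study.

```python
# How the same test quality yields very different predictive accuracy
# as the base rate falls. Sensitivity/specificity are assumed values.
def ppv(sensitivity, specificity, prevalence):
    """Probability that a positive result is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

for label, prev in [("police line-up (1 in 10)", 0.10),
                    ("company-wide screen (1 in 1,000)", 0.001)]:
    print(f"{label}: PPV = {ppv(0.9, 0.9, prev):.1%}")
# line-up: ~50%; mass screen: ~0.9% -- most positives are false alarms.
```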
In reporting the result, the journalist started with (2). Bad idea. I once took a class from an experienced journalist and learned that newspapers always print the most significant news first; supposedly, if the editor then chops off the bottom of your article, the key points remain intact. But the study's purpose is to find a test that predicts Alzheimer's, and (2) tells us only that if we already know the patient has Alzheimer's, then the test will be "nearly" always positive. That is a group of people who cannot benefit from this test, or from any other diagnostic test for that matter.
(3) reads like the clinching argument for the lede ("a spinal-fluid test can be 100 percent accurate in identifying patients with significant memory loss who are on their way to developing Alzheimer's disease"). The second part of (3) could be made clearer, for example, by replacing the period with a semicolon, or by stating that it's every one of the nearly three quarters.
Now recall that in Chapter 4, I discussed the inevitable trade-off between false positive and false negative errors in any diagnostic system. (3) tells us the positive predictive value (PPV) of this test is close to 100%, meaning that someone who tests positive will almost certainly develop Alzheimer's; put differently, false positives are rare.
But this is not enough! One thing we learn from the above statement is that at least 75 percent of 70-somethings with mild memory problems will go on to develop Alzheimer's. The test will indeed be "perfect" if the chance is exactly 75 percent, that is, if none of the people who tested negative ever develops Alzheimer's.
What if the chance is 80 percent? (This is known as the prevalence of the disease in the population.) Then 5 percent of the group will develop Alzheimer's despite having a negative spinal-fluid test result. That's 5 percent out of the 25 percent who test negative, so the negative predictive value (NPV) would be 20/25 = 80 percent, which is good but not perfect. (Someone can look up the journal article when it is published and let us know the actual NPV.)
This then reinforces the earlier point... that when say 80 percent of the tested population will develop Alzheimer's, the prediction problem is not as challenging as one might think. Even if the test were to declare everyone positive, the error rate would be only 20 percent.
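Here is the same arithmetic as a sketch, per 100 patients with mild cognitive impairment; the 80% prevalence is the hypothetical from the preceding paragraphs.

```python
# NPV arithmetic per 100 MCI patients; the 80% prevalence is hypothetical.
n = 100
positives = 75           # about three quarters test positive...
# ...and per the article, all of them develop Alzheimer's (PPV = 100%).
future_cases = 80        # assumed: 80% eventually develop the disease

negatives = n - positives                     # 25 test negative
false_negatives = future_cases - positives    # 5 future cases test negative
true_negatives = negatives - false_negatives  # 20 negatives never develop it
npv = true_negatives / negatives
print(f"NPV = {true_negatives}/{negatives} = {npv:.0%}")  # 80%

# Baseline for comparison: a "test" that declares everyone positive is
# wrong only for the 20 non-cases -- a 20% error rate, so the bar is low.
```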
The bigger point is that prediction systems must be evaluated on the three legs of a stool: PPV, NPV, and selectivity (how aggressively the test hands out positive results). I wrote about this before in the context of terrorist prediction, reacting to a section in SuperFreakonomics.
Finally, (4) leaves much to be desired. The group of most interest for the prediction problem is precisely this group of people who are not currently exhibiting anything unusual. I'd be interested in knowing the research design: did they decide, before the experiment was conducted, how long they would track this group? Or are they tracking the group now, waiting for the moment to declare victory or failure? In any case, the verdict is not in yet.
In a future article, I will make some comments about causation vs. correlation, as suggested by this research.
The headline writer tells us "Cold cuts could cause cancer: study". This is typical of such headlines, and it is always frustrating because, without fail, the studies make claims that are much more specific. Therefore, we must read the articles carefully, and also ask clarifying questions.
Start with: All cold cuts, or only certain types of cold cuts? Any kind of cancer or a specific type of cancer?
The reporter tells us the researchers found a "positive nonlinear association between red meat cold cuts and bladder cancer," which is a very different thing from "cold cuts cause cancer". Besides, they found no association between beef, hamburger, steak, sausage or bacon and bladder cancer.
Next: What patients? Male/female? Age?
Answer: men and women aged 50-71, in eight US states. The article did not state which states.
Statistical thinkers would also wonder about this statement:
The scientists also found that people who ate the most red meat were younger, less educated, less physically active, and had lower dietary intake of fruits, vegetables, and vitamins C and E than those consuming the least red meat.
Once the cause of a disease is found, it is natural to ask which types of people are most at risk; in this case, it is natural to ask who eats the most red meat cold cuts.
However, the fact that those who eat the most red meat cold cuts also eat less fruits, vegetables, etc., and are less educated, and so on, should raise profound questions about what is the real cause of the bladder cancer! Did the researchers rule out lower consumption of fruits and vegetables (say) as a cause for increased bladder-cancer risk? (The subject of causal models is covered in Chapter 2.)
(Also, we should not tolerate the sloppiness of the language used in this and similar articles: every time they mention "red meat", we should read "red meat cold cuts", and they are not the same thing!)
Or, the treachery of objective journalism
Those were my thoughts as I made my way through the New York Times article about today's jobs report. (Not picking on the Times in particular; this is a disease of the so-called "objectivity" in journalism.)
The article keeps stating the "facts". For example:
Employers added 431,000 nonfarm jobs nationwide in May, the biggest increase in a single month in a decade, the Labor Department said Friday. (my emphasis)
This sounds objective -- it is a "true" statement that this number is the "biggest increase in a decade". But it is also a "lie", because this is the only month in the past decade in which the Census took on hundreds of thousands of temporary workers. The article also stated this additional "fact":
But the bulk of the growth was in government jobs, driven by hiring for the 2010 census, and private-sector job growth was weak.
A few paragraphs later, the article stated:
Altogether, 411,000 of the jobs added were for census workers whose positions will disappear after the summer.
Herein lies the rub. We cannot argue with this paragraph, because every statement in it is true as stated, but this is not what I consider "good reporting": it misrepresents what the data mean.
Here's one version that calls a spade a spade:
Employers added 20,000 nonfarm jobs nationwide in May, the Labor Department said Friday. In addition, 411,000 jobs were added for temporary workers on the once-in-ten-years census, positions which will disappear after the summer.
In data reporting and analysis, it is very dangerous to be "objective" in the sense of "just stating the numbers". You want your data analysts to interpret and understand what the data mean. The fact that this increase was "the biggest in a decade" has zero value, none whatsoever! To mention it in the first sentence of the report is a travesty, an embarrassment.
Many economics blogs are vigilant about reporting of economic statistics in the media. I recommend Dean Baker's Beat the Press, and Calculated Risk even had a preemptive strike about the reporting of this jobs announcement (CR has analysis here).
PS. See my related post about how to read this type of data.
The last section of Chapter 5 may feel a little out of place: after examining the ins and outs of statistical testing, using flight safety and lotteries as a backdrop, I tacked on a coda about the laudable but poorly executed attempt by the government to make flight safety data available to the public. I remarked: "a few well-chosen numbers paint a far richer picture than hundreds of thousands of disorganized data." (p.154)
At the time, I debated whether to drop this section because it has little to do with the key concept of the chapter (statistical testing). In hindsight, the decision to leave it intact was wise: an exact parallel has developed in the case of the Fed making credit card terms and conditions available to the public.
As reported in the New York Times, the Fed merely dug a hole in the ground and filled it with piles of PDF files. It provides a simple search engine, so if you know exactly what you want to know, you may be in luck; if you want to understand the big picture, you are on your own.
Lest you think this interface was designed for experts only, the left margin proclaimed this the "Consumer's Guide".
Maybe Ed Tufte will get around to fixing this (and a myriad other government databases).
On to political culture. I found two headlines for the NYT article, one more favorable to the administration and one more descriptive of reality. I leave you to decide. The paper version I looked at has the headline shown on the right.
Reference: "Credit Card Database is Heroic and Mystifying", Sewell Chan and Andrew Martin, New York Times, May 24 2010.