Two cautionary tales appeared in the press recently, serving notice to all "data scientists" (as statisticians are fancifully called these days). It's hard work to earn the status of a "science".
Via the New York Times comes the story of Dr. Robert Spitzer (link). As a young psychiatrist in the 1970s, he successfully pushed the profession to narrow the definition of homosexuality as a disorder. He observed wryly that many gay people are happy, and therefore only those who are depressed should be diagnosed.
In 2001, he presented new findings claiming that homosexuality can be "cured" by reparative therapy. This was his method:
He recruited 200 men and women, from the centers that were performing the therapy, including Exodus International, based in Florida, and Narth. He interviewed each in depth over the phone, asking about their sexual urges, feelings and behaviors before and after having the therapy, rating the answers on a scale.
He then compared the scores on this questionnaire, before and after therapy. “The majority of participants gave reports of change from a predominantly or exclusively homosexual orientation before therapy to a predominantly or exclusively heterosexual orientation in the past year,” his paper concluded.
He strenuously defended the study for years, after it got published in his friend's journal without going through the typical peer-review process. (The article was published with commentaries by peers, which according to the NYT, were "merciless".)
At 80, he is coming forward to apologize and retract the study. Bravo to him for doing this. But one wonders how the industry of science failed to expose this failing much sooner. Is it because of the stature of the researcher? Is it conformance? Is it because he circumvented the usual peer-review process? ...
The reporter said the biggest problem with the study was self-interested subjects lying about sensitive issues like these. Actually, no. The biggest problem is the absence of a control group - gay men and women who did not receive such therapy. It boggles my mind that a study done in 2001 would have only cases and no controls. The case-control methodology has been in use since the 1950s/60s.
***
If you think that was bad, hold your nose before you read this Wall Street Journal article about cancer studies (link).
Here is a sample of the stinky sentences (my italics in all cases):
After publishing a paper on a rare head-and-neck cancer, [Dr. Mandic] learned the cells he had been studying were instead cervical cancer...
Dr. Mandic entered a largely secret fellowship of scientists whose work has been undermined by the contamination and misidentification of cancer cell lines...
Cell repositories in the U.S., U.K., Germany and Japan have estimated that 18% to 36% of cancer cell lines are incorrectly identified.
Dr. Tarin has spent 25 years working with that cell line--or so he thinks. A body of research suggests that MDA-MB-435 isn't breast cancer; many scientists now believe...[it's] melanoma... Dr. Tarin disagrees.
The prevailing attitude [among scientists] is that the other lab's cell line may be contaminated but not mine.
Nearly 40 years later, ... found 1,000 citations of the same contaminated cancer lines revealed in Dr. Gartler's 1966 findings, which have since been replicated many times using more advanced techniques. "They [the scientists] are either crooks or stupid."
As data scientists like to say, "garbage in, garbage out". But who among us is courageous enough to voluntarily consign decades of our own research to the dustbin?


I see your point about the lack of controls in the "reparative therapy" here, but I think the reporter is making a very good point. With no controls, we have no basis for making the inference that the treatment caused change. The problem, though, is that with only retrospective reports from self-interested parties, we have a very, very limited basis for inferring that there was any change at all. Unfortunately, the single-group posttest-only design is about the most common design with "applied" interventions, including job training. For what it's worth.
For my money, not being able to be sure that there is an effect to attribute to some cause trumps being able to pin the effect to a particular mechanism, in any particular case.
Posted by: Sethspain.wordpress.com | 05/23/2012 at 08:15 AM
Thank you for another poignant example of why method matters ... my statistics students get sick of me harping on correct method (even though controls are difficult to apply in complex situations, such a basic idea) but if (when?) my students get that part time lab job carrying out or processing results from someone else's experiment, I want them to RUN from meaningless data! Note that it is Benjamin Franklin who was credited with first using control methodology (http://www.stephanaschwartz.com/wp-content/uploads/2010/03/BF-Scientific-Testing.pdf)
Posted by: Scott Hagin | 05/26/2012 at 08:17 AM
Now and then a patient will come to me wondering if an odd spot on their face or neck is cancerous. Nine times out of ten it's just an age spot or a big freckle that has formed from being out in the sun too much without protection.
Posted by: Intrakid | 09/13/2012 at 03:02 AM