The recent articles proclaiming the death of organic foods could just as well have been written by big agribusiness. The reporting raises a whole host of questions. Here's the NYT on the study (link) and here's Time (link).
Here are some points to ponder:
1. There is no new research. Time's headline gets this wrong, even though the article itself acknowledges that the research is a review of existing studies (a so-called meta-analysis). The NYT is clear on this point: the study is an "extensive examination of four decades of research".
2. Despite the headline claim that organic food is no more nutritious and no less contaminated than non-organic food, both articles devote the majority of their words to reporting statistics that contradict the claim. We learn that organic food has less pesticide residue, lower levels of bacteria, more omega-3 fatty acids, more phosphorus, more phenols, and so on.
3. The researchers then tell us why none of those findings can be trusted. The explanations run the gamut: the difference is not practically significant because the levels are too low to matter, or because the levels are so high that the difference doesn't matter, or, in the case of bacteria, because the bacteria will be killed during cooking.
4. A particularly shocking claim is the one the NYT cites to dismiss the phenols finding: 'While the difference was statistically significant, the size of the difference varied widely from study to study, and the data was based on the testing of small numbers of samples. “I interpret that result with caution,” Dr. Bravata said.' Statistical significance was invented for one purpose: to deal with sampling variability and sample size. It makes little sense to blame those two factors for a result after it has passed a significance test; if the test can't be trusted on those grounds, they shouldn't bother testing for significance at all. In addition, since this is a meta-analysis, the researchers could easily have thrown out studies whose sample sizes they considered too small, or whose sampling variability was too high -- doing so upfront would reduce the suspicion of cherry-picking. (A small simulation after this list shows how sample size drives both factors.)
5. In fact, these researchers seem already to have hit the tip of an iceberg. At the end of the NYT article, they admitted that they erroneously omitted a study showing a benefit of organic food. Given that selecting which studies to include is such a crucial step in a meta-analysis, committing this error does not help their credibility.
6. As far as I can tell, none (or almost none) of these studies are randomized controlled trials (RCTs) in which the health effects of organic and non-organic foods are assessed. This is quite surprising given the assertion that the two groups of foods are the same. Since it is unlikely that anyone would die from such a trial, there just isn't any reason why we should rely on observational studies. In fact, according to Time, only 17 of the more than 200 studies included in the meta-analysis measured "health outcomes"; all the others compared "nutritional content".
7. The bulk of both articles accepts the underlying logic that consumers buy organic food because of better nutrition or lower contamination. But I know many people who buy organic food to support small businesses, and others who prefer knowing that the chickens they eat were able to run around rather than being cooped up inhumanely. This is a fundamental flaw in a study that is being interpreted as a reason not to buy organic food, and the NYT reader would have to read all the way to the end of the article to see the point raised.
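To make point 4 concrete, here is a minimal simulation (in Python, with an invented effect size, spread, and sample sizes; none of these numbers come from the study) showing how widely study-to-study estimates of a real difference can swing when each study tests only a handful of samples. This is exactly the kind of study a meta-analysis could screen out upfront.

```python
# Sketch: how much do estimated differences vary with per-study sample size?
# All numbers (true_diff, sd, sample sizes) are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
true_diff = 0.2   # assumed true organic-minus-conventional difference (arbitrary units)
sd = 1.0          # assumed within-study standard deviation

def simulated_study(n):
    """Return the estimated difference from one study with n samples per group."""
    organic = rng.normal(true_diff, sd, size=n)
    conventional = rng.normal(0.0, sd, size=n)
    return organic.mean() - conventional.mean()

for n in (5, 20, 100):
    estimates = [simulated_study(n) for _ in range(1000)]
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    print(f"n={n:>3} per group: middle 95% of estimated differences spans {lo:+.2f} to {hi:+.2f}")
```

With only five samples per group, the estimates range from clearly negative to well above the true difference; with a hundred, they cluster tightly around it. A simple minimum-sample-size rule applied before pooling would address the very concern the researchers raised after the fact.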
***
For the nutritional scientists out there: do you know why RCTs aren't popular in your field? Almost everything we read in the newspapers is reported from longitudinal observational studies that really do not seem robust, resulting in all kinds of contradictory conclusions and embarrassing reversals.
As to point 4, as I recall from every statistics course I ever took, all you need to achieve statistical significance is a sufficiently large N. It was always stressed that practical significance is up to the researcher to decide, based on considerations other than the P-value. If organic food consistently has half as much pesticide contamination, but both are measured at levels below the threshold considered a safety risk, then they are equally safe because neither poses a risk to health. Where is the problem with this sort of interpretation?
Posted by: Joshua | 09/13/2012 at 09:31 PM
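Joshua's point about N is easy to demonstrate. The sketch below (made-up contamination levels, both far below a hypothetical safety limit) holds the true difference fixed and only grows the sample size; the p-value duly shrinks toward zero while practical significance is unchanged.

```python
# Sketch: a tiny, practically irrelevant difference becomes statistically
# significant once N is large enough. All numbers are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
safety_limit = 10.0                          # hypothetical safe upper limit
mean_organic, mean_conventional = 1.0, 1.1   # both far below the limit
sd = 0.5

for n in (20, 200, 20000):
    organic = rng.normal(mean_organic, sd, size=n)
    conventional = rng.normal(mean_conventional, sd, size=n)
    _, p = stats.ttest_ind(organic, conventional)
    print(f"n={n:>6} per group: p-value = {p:.4g} (both means well under the limit of {safety_limit})")
```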
I expect the reasons for not running RCTs are that they cost a lot, can take a long time to return results, and the results may still be inconclusive. A food study is going to be complicated by people eating frozen foods or takeaways, or eating out, whether commercially or at friends' places. Plus, people just decide that they don't want to eat what they are supposed to. That happens a lot in studies of diet and cholesterol.
Posted by: Ken | 09/14/2012 at 02:43 AM
Joshua: Practical significance is really important, but like everything in statistics, there is a fine line between use and abuse. In the case you described, the proper conclusion (if we accept the interpretation) is that both food groups satisfy the safety standard. It isn't right to conclude that the two food groups are the same when the statistical test showed they have different levels of bacteria. It should always arouse suspicion when the researcher essentially changes the judging criteria after seeing the results.
From the researcher's perspective, this is also dangerous territory. It's very easy to lapse into finding stories that fit any conclusion and explain away the unwanted cases.
Posted by: Kaiser | 09/14/2012 at 09:35 AM
I fail to see how previously determined safe upper limits of exposure amount to "chang[ing] the judging criteria after seeing the results." The threshold for practical significance was determined experimentally, presumably by a different group, and in advance of the trials that detected statistically significant differences below the level of practical significance. How would you prefer that practical significance be determined in this case?
I suspect that Ken is correct about the RCT. Anecdotally, a professor of mine related how she participated in a six-month dietary comparison study, and most of her peers flat-out lied about their eating habits because they got tired of following the prescribed nutritional regimen (and they were being paid to participate). That's one of the perks of being an animal researcher: my animals eat exactly what I give them, and nothing else.
As to point 7, the original impetus of the organic movement was food safety and health concerns. That many conflate organic with free-range or local is a flawed connection: organic does not require free-range, nor is it necessarily local. Organic fruits not native to the shopper's geographical region are by definition not local, regardless of their organic status. In California, the companies that grow most of the available organic food and those that grow most of the non-organic food are the same companies; they simply target part of their production at different markets.
Posted by: Joshua | 09/14/2012 at 10:00 AM
Joshua: If the safety limit is the judging criterion, the null hypothesis should be bacteria from organic food <= threshold, not bacteria from organic food = bacteria from non-organic food. A test is only as good as how you conceptualize it. To set it up in the latter way and then interpret it in the former way is changing the metric after the fact.
If the safety limit is the criterion, then every single test in this study should adopt this setup, but it doesn't sound like that is the case here.
Posted by: Kaiser | 09/14/2012 at 12:33 PM
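For readers who want to see the distinction in code, here is a rough sketch of the two setups Kaiser contrasts, with invented measurements and a hypothetical safety limit (the one-sided `alternative` argument needs a reasonably recent version of SciPy). The two tests answer different questions and can easily point in different directions.

```python
# Sketch: equality test vs. threshold test. All data are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
threshold = 10.0                           # hypothetical safety limit
organic = rng.normal(1.0, 0.5, size=50)    # hypothetical bacteria measurements
conventional = rng.normal(1.5, 0.5, size=50)

# Setup the study appears to use: H0: organic mean = conventional mean (two-sample, two-sided)
_, p_equality = stats.ttest_ind(organic, conventional)

# Setup described above: H0: organic mean <= threshold (one-sample, one-sided);
# failing to reject is consistent with organic food meeting the safety standard
_, p_threshold = stats.ttest_1samp(organic, popmean=threshold, alternative='greater')

print(f"Equality test (organic vs. conventional): p = {p_equality:.4f}")
print(f"Threshold test (organic vs. safety limit): p = {p_threshold:.4f}")
```

In this toy example the equality test flags a difference between the two groups while the threshold test gives no evidence that organic food exceeds the limit, which is exactly why it matters which null hypothesis was actually set up.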
There are two different goals here.
The first is to characterize any differences between the results of the two production strategies. This goes to the largely data-less argument that one is "better" than the other. Based on this review, organic food has less pesticide residue than conventionally produced food. That is important to understanding the downstream effects of production changes, regardless of what the safe exposure level is.
The second, and to my mind independent, question is whether each is "safe." People are only concerned about pesticide residues because of safety concerns. The EPA is responsible for determining safe exposure levels and making recommendations as to the safe upper limit of exposure. Whether something possesses unsafe concentrations of a specific pesticide residue is determined by comparing an analyzed value to a table value. To my knowledge there is no statistical test to determine whether a sample concentration is statistically different from a table value. Maybe I'm wrong, and you can correct me here.
This study answered the first question. The authors then compared the values to the threshold and found them, regardless of source, to be below it. If the EPA receives evidence that its safety limit was not conservative enough and revises it downward, the difference may turn out to be practically significant. In that case, it is good that the researchers reported the difference even though it is not of practical significance at the moment.
Posted by: Joshua | 09/14/2012 at 01:30 PM
Cost is one reason not to do an RCT; compliance, or the lack thereof, is another. I imagine that cumulative effects over longish periods of time are of the most interest here, and that exacerbates both problems. I can't imagine a study design that wouldn't be susceptible to huge noncompliance over long periods, and measuring that would be tough. Did you have a design in mind?
Posted by: jared | 09/16/2012 at 09:57 PM
Ken and Jared: Cost and noncompliance are present in any kind of RCT, so I wouldn't consider them valid reasons for not running one. While RCT results may be less than perfect because of noncompliance, as you rightly point out, I'd still trust such a result more than any number of observational studies with convenience samples.
One possibility is to start with animal studies, looking at animals with short generational cycles. I'd also trust such studies, if run as RCTs, more than any number of observational studies with convenience samples.
Can anyone trust an observational study with convenience samples that purports to measure long-term cumulative effects, when the researchers have no control over anything during the period of "observation"?
Posted by: Kaiser | 09/21/2012 at 12:26 AM