One of the most misguided and dangerous ideas floated around by a group of Big Data enthusiasts is the notion that it is not important to understand why something happens, so long as "we have a boatload of data". This is one of the central arguments in the bestseller Big Data, and it reached the mainstream much earlier when Chris Anderson, then editor-in-chief of Wired, published his flamboyantly titled op-ed proclaiming the "End of Theory."
In making the claim that causal analysis is hopeless and pointless, these proponents are disowning entire fields of study. While focusing their vitriol on the social sciences, they somehow miss the obvious: that causal thinking is and has always been the foundation of the physical sciences and engineering. Even business executives understand the primacy of "root-cause analysis."
Contrary to these folks, I believe social scientists will produce the most exciting research on causal analysis. Human behavior is generally much more variable than natural phenomena, complicating the search for causes. We need even smarter people to tackle these problems.
That said, a lot of published social science research exhibits flawed thinking about causes. Andrew and I describe a few of the problems in the latest Statbusters (link). An assumption is often made that an observed effect is due to a single cause, and much effort is expended on identifying this one cause from a slate of candidates. Further, it is just as important for us to know which studies failed, but such failures are never reported in journals or the media. This publication bias results in researchers examining the same correlations over and over again; eventually, one research group will discover a "statistically significant" effect and get it published, even though the totality of the evidence would contradict that one published study.
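To see how little it takes for this to happen, here is a quick simulation sketch (my own illustration, with made-up but plausible numbers: 40 independent groups, each running a 20-subject study of a correlation whose true value is exactly zero). It estimates how often at least one group ends up with a "statistically significant" result to publish.

```python
# Toy simulation of publication bias: many groups test the same null
# correlation; with enough attempts, someone "finds" a significant result.
# The sample size, number of groups, and alpha are illustrative choices.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)

n_subjects = 20       # small-study sample size
n_groups = 40         # independent teams testing the same (null) correlation
alpha = 0.05
n_simulations = 1000

hits = 0  # simulations in which at least one group reaches p < alpha
for _ in range(n_simulations):
    pvals = []
    for _ in range(n_groups):
        x = rng.normal(size=n_subjects)
        y = rng.normal(size=n_subjects)   # truly unrelated to x
        _, p = pearsonr(x, y)
        pvals.append(p)
    if min(pvals) < alpha:
        hits += 1

print(f"Chance that at least one of {n_groups} null studies is 'significant': "
      f"{hits / n_simulations:.0%}")
# Roughly 1 - 0.95**40, i.e. about 87%, even though the true effect is zero.
```

In this setup, only the "lucky" group's finding makes it into a journal, which is exactly how the published record can end up contradicting the totality of the evidence.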
Looking at the brain imaging studies, there are a lot of 20-subject studies. That seems to be a standard that originates from when scans were so expensive that even the best-funded researchers could only afford that number. Now it seems to be the number that someone with minimal funding can manage. Then they apply a lot of different outcomes, looking at different parts of the brain, and different statistics. Amazingly, something always seems to be significant.
Unfortunately, as long as the journals accept this, such poor research will keep being published. It would be better to run larger studies, but there is now an expectation about the number of papers a researcher authors.
Posted by: Ken | 07/13/2015 at 06:40 PM
Regarding publication bias, are you aware of any venues that allow researchers to share these "non-events" to help prune the search space? As it stands, it seems like the process starts with an "interesting" finding which gets published and then (maybe) gets replicated to show whether or not the experiment can be reproduced.
Even assuming there were some outlet that allowed "failed" experiments to be shared, do you think there's a bias among researchers to p-hack or whatever in order to publish something, more often than not? Publish or perish and all that...
Posted by: Adam Schwartz | 07/15/2015 at 09:05 AM
@Adam There is a discussion on outlets for studies at https://www.researchgate.net/post/Who_knows_about_journals_preferably_publishing_negative_results_from_pharmacological_clinical_trials
An obvious one is http://www.jnrbm.com
There has been research showing that well-designed trials tend to be published irrespective of the result. Telling the world that a drug doesn't reduce blood pressure, with a reasonably tight confidence interval, is worthwhile. I suspect the problem is more the drug companies that don't want such results published.
Posted by: Ken | 07/16/2015 at 01:21 AM
@Ken, thanks for the leads!
Posted by: Adam Schwartz | 07/20/2015 at 08:42 AM