I am traveling so have to make this brief. I will likely come back to these stories in the future to give a longer version of these comments.
I want to react to two news items that came out in the past couple of days.
First, Ben Stiller said that prostate cancer screening (the infamous PSA test) "saved his life". (link) So he is out there singing the praises of the PSA test, which has been disavowed even by its inventor (link), although it is still routinely used by many physicians.
One can't dispute that the PSA test result is how Ben Stiller learned about his cancer, or that he is better off today because of that discovery.
However, imagine the following scenario: I invent my own screening test. The test consists of flipping a coin: heads, you have cancer; tails, you don't. Among the people who came up heads, I can find one who truly has cancer. I saved his life because my test alerted him to this fact. Because I saved this person's life, my test must be really good. (If one anecdote is too few, I could find a handful of people whose lives I have saved.)
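To put rough numbers on the coin-flip test, here is a small Python sketch. The population size, the 1% cancer rate, and the function name `coin_flip_screening` are all made up by me for illustration; they are not real PSA or prostate cancer figures.

```python
import random


def coin_flip_screening(population=100_000, prevalence=0.01, seed=0):
    """Simulate the coin-flip 'test' on a hypothetical population.

    population and prevalence are illustrative assumptions, not real data.
    Returns (number flagged by heads, number flagged who truly have cancer).
    """
    rng = random.Random(seed)
    flagged = 0
    flagged_with_cancer = 0
    for _ in range(population):
        has_cancer = rng.random() < prevalence  # true disease status
        heads = rng.random() < 0.5              # the "test" ignores that status
        if heads:
            flagged += 1
            if has_cancer:
                flagged_with_cancer += 1
    return flagged, flagged_with_cancer


flagged, saved = coin_flip_screening()
ppv = saved / flagged  # share of "positives" who actually have cancer

print(f"Told they have cancer: {flagged}")
print(f"Of those, truly have cancer: {saved}")
print(f"Share of positives who are truly sick: {ppv:.2%}")
```

Under these assumptions the coin produces hundreds of "lives saved" anecdotes, yet the share of positives who are truly sick stays close to the 1% I assumed for the whole population. The test adds no information at all; the anecdotes come from the sheer number of people flagged.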
Second, the FBI tells reporters that the Minnesota mall attacker "withdrew from friends in months before attack." (link)
Imagine that you are trying to predict who will be the next disgruntled attacker. Based on the FBI statement, you want to round up everyone who "withdrew from friends." How many people would that include? How many of them will eventually be attackers?
The same holds for all the other findings, such as "he converted to Islam recently" and "he posted something hateful on Facebook".
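The two questions about "withdrew from friends" come down to simple arithmetic, sketched below in Python. Every number here is a hypothetical I picked for illustration (a population of one million, 5% of ordinary people showing the sign, ten future attackers, all of whom show it); none of them comes from the FBI or any data source, and `flagged_versus_attackers` is just a name I chose.

```python
def flagged_versus_attackers(population, attackers, trait_rate_general,
                             trait_rate_attackers=1.0):
    """Count who gets rounded up by a warning sign, and how many are attackers.

    All inputs are hypothetical assumptions for illustration.
    trait_rate_general:   share of non-attackers who show the sign
    trait_rate_attackers: share of future attackers who show the sign
    """
    innocent_flagged = (population - attackers) * trait_rate_general
    attackers_flagged = attackers * trait_rate_attackers
    flagged = innocent_flagged + attackers_flagged
    share = attackers_flagged / flagged  # chance a flagged person is an attacker
    return flagged, attackers_flagged, share


flagged, hits, share = flagged_versus_attackers(
    population=1_000_000,     # assumed population
    attackers=10,             # assumed number of future attackers
    trait_rate_general=0.05,  # assumed: 5% of everyone else withdrew from friends
)

print(f"People rounded up: {flagged:,.0f}")
print(f"Future attackers among them: {hits:,.0f}")
print(f"Share of those rounded up who are attackers: {share:.4%}")
```

Even when I assume that every single future attacker shows the sign, almost everyone rounded up is innocent, because the sign is so common among ordinary people. Swapping in a different warning sign only changes `trait_rate_general`; the conclusion holds as long as attackers are rare.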
It is precisely when we want something badly, like information that saves our lives or prevents terrorist attacks, that we become most susceptible to nonsense data analyses.