This is a first. I'm agreeing with David Brooks. Sort of.
In his new NYT column titled "Death by Data" (link), Brooks disparaged the recently celebrated practice of using machine learning in electoral politics: trying to win elections "Obama-style" by targeting investments at the people most likely to listen to the candidate's message, and crafting electoral messages by testing and measuring how people react to certain words and phrases.
Electoral politics is another success story often cited by Big Data people.
Brooks said a few things that pinpoint one consequence of how Big Data is being used today. Here are some nice quotes:
"As politics has gotten more scientific, the campaigns have gotten worse, especially for the candidates who overrely on these techniques."
"Data-driven politics is built on a philosophy you might call Impersonalism. This is the belief that what matters in politics is the reaction of populations and not the idiosyncratic judgment, moral character or creativity of individuals."
"Data-driven politics assumes that demography is destiny, that the electorate is ... a collection of demographic slices."
"... it is more important to target your likely supporters than to try to reframe debates or persuade the whole country."
"It puts the spotlight on messaging and takes the spotlight off product: actual policies."
The question in my mind is whether these issues are caused by the data-driven philosophy of the analysts, as Brooks asserts, or by the win-at-all-cost philosophy of the politicians.
With a change of context, these statements also apply to the business use of data. Many machine learning models improve the numbers but not necessarily the user experience or customer satisfaction. The prevalence of tricks used to induce unintended clicks on display ads is a powerful reminder of Brooks's "Impersonalism" idea.
I'm preparing my talk next week at the Business Intelligence Innovation Summit in Chicago, which is titled "The Accountability Paradox in Big Data Marketing". More data has not made us more accountable, so far.
***
I'm also unhappy about cookie-cutter campaign speeches that are but a string of buzzwords proven by A/B testing to appeal to the electorate. But this is made worse by the politicians who are willing to utter these words brainlessly, by the politicians who are willing to discard their own beliefs in order to win elections, and also by the electorate who take these politicians at their word.
As Brooks correctly diagnosed, by using OCCAM data (link), particularly observational data without controls, the analysts surface correlations and have nothing to say about causation. This leads to a situation where the models provide little if any information to the politicians about the desires or wishes or expectations of the electorate. All they know is that if they include the word "family" and exclude the word "fear", they may get a higher rating, and if they get higher ratings among persuadable segments of the likely voters, they may win an election.
***
The second half of Brooks's article veers off course, displaying the arbitrary nature of the non-data-driven argument. He parades a list of names of past candidates, and claims that their failures are due to relying too heavily on data. Obama won the election but lacked any coherent agenda, for example. But Brooks offers no evidence that the subsequent failure was caused by bad use of data.
I last wrote about Brooks here.
Brooks, David - agreement: see clock, stopped, correct twice/day
Posted by: Michael Schettler | 11/06/2014 at 11:14 AM
Non-data driven argument is not necessarily arbitrary. Virtually all well-formed arguments are made through logic based on premises, and are either deductive or inductive through evidence & first principles. Heck, we're too certain of things already - "70.253% chance" of so-and-so being elected. In contrast to a 70.252% chance. Numbers make things sound so much more scientific, but really it just makes us look silly. Law has (usually) understood this, hence "reasonable doubt".
It's right to question Brooks on his argument because he fails to back it up with some type of evidence, whether logical or empirical. But it's not right to say that an argument is arbitrary just because it isn't "data-driven".
Posted by: Nate | 11/07/2014 at 09:09 AM
Nate: Good point, as even statistical arguments are logic-based and reliant on certain principles. What I wrote implies that I think all non-data-driven arguments are arbitrary. I should have said that some of them are. In the case of Brooks, he offered nothing to support his central argument that those candidates failed because of "death by data".
Posted by: junkcharts | 11/07/2014 at 10:16 AM