I mentioned the Harvard Business Review article on business use of customer data in the "Big Data" era. In the previous post, I looked at the nature of the evidence used by the authors. In this post, ignoring my discomfort with some of the evidence, I examine the conclusions of the article.
The report has a three-part structure: the first section describes the issues; the second reports results from a few surveys conducted by frog (a global strategy and design agency) on various issues related to data privacy; and the third presents examples of the recommendations frog makes to its clients, offered here generally to businesses that collect and monetize customer data.
The survey results are revealing (although with a sample of only 900 people across five countries, I'm not sure how much to believe them). The agency found that 97% of those surveyed are concerned about businesses and governments misusing their data. Seventy-two percent of Americans are reluctant to share information with businesses because they "just want to maintain their privacy".
The authors also learned that consumers grossly underestimate the extent of data collection. Only 25% of respondents said they knew businesses tracked their location, and only 14% said they knew businesses shared their web-surfing history. Finally, the analysts attached dollar values to the privacy of different types of data.
I follow them up to this point. In fact, the authors summed it up very nicely at the beginning of the article: most [companies] "prefer to keep consumers in the dark, choose control over sharing, and ask for forgiveness rather than permission."
Unfortunately, I am let down by the list of recommendations that follows. They feel to me like tweaks on failed ideas rather than paradigm shifts.
The first recommendation is to "educate the consumers". The authors give the example of one of their own consulting clients, which requires "customers" to watch a video and give preliminary consent before sharing their own (genomic) data; the personal data are withheld until the "customer" returns a hard-copy agreement.
We don't need to be reminded that every day, we "voluntarily" sign Terms and Conditions which no ordinary person actually reads. Frequently, we are told not to use a website if we don't agree with any part of a lengthy agreement written in one-sided language favoring the business.
The "new" solution doesn't change the status quo. In fact, it gives businesses a stronger case for arguing that their users have voluntarily given up the right to their own data. In my view, until businesses confront the issue of properly disclosing how they collect data, what information is being collected, and how such data are being sold or traded, consumers will continue to find such practices creepy.
The second recommendation looks good on paper but is impractical. Another of frog's clients is featured here: this client allows customers to specify which pieces of data can go to whom.
Assume there are (only!) 100 variables being collected and five levels of access control. That amounts to 500 yes/no questions (100 variables times five levels) that each user must answer in order to gain full control of the data. In practice, most users will decide not to bother because it is too complex and time-consuming. The solution is a form of suffocation by paperwork.
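To put a number on that paperwork, here is a minimal sketch of the decision matrix such a system implies. The variable names and access levels are my own hypothetical illustrations, not anything from the article:

```python
# Hypothetical illustration of the per-variable access-control choices.
variables = [f"var_{i:03d}" for i in range(100)]  # 100 collected variables
levels = ["public", "partners", "advertisers", "internal", "nobody"]  # 5 levels

# If each (variable, level) pair is a separate yes/no consent question,
# the full matrix of decisions facing each user is:
decisions = [(v, lvl) for v in variables for lvl in levels]
print(len(decisions))  # 500 yes/no questions per user
```

Even granting a friendlier interface (one five-way choice per variable instead of five binary ones), that is still 100 deliberate decisions per user before the data are fully under control.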
For the data analysts, such a solution creates headaches. It generates self-selected data of the worst kind. Each variable has its own source of bias as different subsets of users decide to withhold their data for their own reasons.
To implement such a system properly requires a herculean effort. Say I reviewed the list of 100 variables and divided them into five groups of 20, one per level of control (from allowing anyone to see my gender to hiding my age from everyone). Two months later, I changed my mind and removed everyone's access to 80 of the 100 variables. Now the database administrator must find all instances of those 80 variables and delete them. Some of the data may already have been sold to other entities, and what if those entities re-sell my data after I have asked the original source to delete it?
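The revocation problem above can be sketched in a few lines. The data-broker setup here is a hypothetical illustration of the general point, not a description of any company in the article:

```python
# Hypothetical illustration: once data have been sold downstream,
# deleting them at the source does not delete the copies already held.
source_db = {"age": 34, "gender": "F", "zip": "10001"}

# The original company sells a snapshot of the record to a third party...
broker_copy = dict(source_db)

# ...and later the user revokes access to "age" and "zip".
for var in ("age", "zip"):
    source_db.pop(var, None)

print(sorted(source_db))    # ['gender'] -- deleted at the source
print(sorted(broker_copy))  # ['age', 'gender', 'zip'] -- still out there
```

Honoring the revocation would require the original source to track every downstream buyer and compel each of them to delete (and to stop re-selling) the revoked variables, which no current system does.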
The last recommendation argues that businesses should not need to pay users for their data. Given the finding in the second section that users assign meaningful dollar values to their data, this seems like a solution for businesses rather than for consumers.
Pandora's free, advertising-supported service is used as an example of customers' willingness to exchange their privacy for "in-kind value". The article fails to mention just how much money Pandora has been paying for such data! As this other HBR article tells us, Pandora is "13 years, 175 million users, little profit". It has never established a profitable business model: while 80% of its revenues come from advertising to those "free" accounts, 60% of its revenues immediately go out the door as royalty payments for the "free" music! It's not surprising that many consumers willingly engage in this lopsided exchange with Pandora.
I often wonder: if consumers realized that oversharing their data works to their disadvantage, would they become more interested in how businesses use it?
For instance, insurance companies will be very interested in acquiring data from personal analytics devices, like Fitbit. They will use the data to predict whether you have health risks, and they will charge you more for insurance. Everyone is at risk for something.
The Uber app lets its users track their drivers; in Manhattan, it's like watching a horse race as your driver negotiates the city gridlock. The same data give Uber an accurate picture of supply and demand, which drives its surge-pricing algorithms. That's how you end up paying five to ten times the normal cab rate.
Businesses use personal data to reduce information asymmetry, which in the past prevented them from extracting maximum value from consumers.
Today, the data privacy question is phrased as "Company X would like to collect information about your heart rate and in exchange, you will get notified if any irregularity is detected. Are you willing to share such data with Company X?"
Imagine you are asked a different question: "Company X would like to collect information about your heart rate and in exchange, you will get notified if any irregularity is detected. Being notified of heart-rate irregularity may help you, but 80% of the warnings will be false alarms. Also, your heart-rate data will be used by our insurance arm to adjust your insurance premiums. There is a 50% chance that your premium will increase after sharing your data. Are you willing to share such data with Company X?"
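A back-of-envelope sketch shows what that fuller disclosure implies for an individual user. The false-alarm rate and the premium-increase probability come from the hypothetical question above; the alert count and dollar amount are additional assumptions of my own, purely for illustration:

```python
# Back-of-envelope expected-value sketch for the fuller disclosure.
alerts_per_year = 10        # assumed number of irregularity alerts
false_alarm_rate = 0.80     # stated in the hypothetical question
p_premium_increase = 0.50   # stated in the hypothetical question
premium_increase = 240.0    # assumed extra dollars per year, if it happens

useful_alerts = alerts_per_year * (1 - false_alarm_rate)
expected_extra_premium = p_premium_increase * premium_increase

print(useful_alerts)           # 2.0 genuinely useful alerts out of 10
print(expected_extra_premium)  # 120.0 expected extra dollars per year
```

Framed this way, the "exchange" is a handful of useful warnings against a real expected cost, which is a very different proposition from the first question.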