In talking to people about the invasive and pervasive data collection by businesses and governments these days, I often encounter the following viewpoints:
- Businesspeople who believe in the power of data hold that data collection benefits their customers and humankind. For example, carmakers think that the data will save lives and, at a minimum, help drivers manage their car's performance.
- Many consumers say that they don't care about their data being collected: they have "nothing to hide", and they are powerless anyway, so they accept the situation.
If those views hold, then the paradox of data collection is: why is there a need to hide data collection practices? Why is consent hidden and coerced? Why are sensors hidden and silent?
***
If those views hold, then the following practice seems possible: businesses disclose clearly how and what data are collected, and consumers are given a choice of whether to have their data collected. Those consumers who don't care sign the consent and benefit from the data collection; those who care about their privacy do not sign the consent and continue to be served, but do not benefit from the data collection.
So what is the problem?
You probably already saw this interesting work from last year. Another demonstration of how some sensitive information can be teased out of "confidential" data.
http://toddwschneider.com/posts/a-tale-of-twenty-two-million-citi-bikes-analyzing-the-nyc-bike-share-system/
Posted by: Dave C. | 01/31/2018 at 11:30 AM
DC: Thanks for the link. I like the part of the analysis on the uniqueness of the trips. He's saying that age, gender, whether the Citibike subscription is annual, trip starting location, and date-hour of pickup together uniquely identify a trip 84% of the time. This is a bit alarmist because he's talking about trips, not riders. But it is absolutely true that public data carry unintended risks.
Posted by: Kaiser | 02/01/2018 at 02:51 AM