For my third post about the Facebook data scandal, I’m proposing solutions.
Following from my last post about the paradox of data collection, I urge that we don’t throw the baby out with the bathwater. Data collection and analyses can be beneficial to Facebook users. But we also know that the same datasets can be used to advance business goals at the expense of user interests. So solutions have to balance the two sides.
If the data collection is solely or mostly for the users’ benefit, as businesses assert, then it can be conducted in the open, in a manner agreeable to all parties.
In the face of the Facebook scandal, it’s time the industry adopted the following principles of responsible data collection:
1 - Opt-ins, not opt-outs
Currently, for most websites and mobile apps, the default is maximum data collection. Users wanting privacy then figure out how to limit the amount or type of data collected about them. This is an example of opt-out. The default should instead be opt-in: no data collection unless instructed by users.
When the default setting is opt-in, businesses have to win over the users’ trust, and so they will have a much stronger incentive to clarify and explain the benefits of the data collection. Say goodbye to the days of hand-waving claims, coercion and trickery.
Some website or app developers argue that they require certain data to better the user experience. For example, if you are searching for a local restaurant, the search engine can provide more relevant results if it knows your current location. However, the app or website developer will keep track of your location, possibly forever, and possibly for sale to third parties, and so some users may elect to suffer inconvenience to gain privacy. For these users, a few more clicks to zoom in to the right neighborhood is not a major hassle. They feel that the more private, less convenient experience is better than the less private, more convenient experience!
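The opt-in principle can be made concrete with a few lines of code. Here is a minimal sketch, in Python, of a settings object in which every data-collection flag defaults to off, so nothing is collected until the user explicitly enables it. The class and field names are hypothetical, chosen for illustration only.

```python
# A sketch of opt-in defaults: every data-collection flag starts as False,
# and only flips to True at the user's explicit request.
from dataclasses import dataclass


@dataclass
class PrivacySettings:
    share_location: bool = False   # opt-in: off until the user says yes
    share_contacts: bool = False
    personalized_ads: bool = False

    def opt_in(self, setting: str) -> None:
        """Enable a single setting at the user's explicit request."""
        if not hasattr(self, setting):
            raise ValueError(f"unknown setting: {setting}")
        setattr(self, setting, True)


# A new account collects nothing by default; location-based search only
# works after the user explicitly opts in.
settings = PrivacySettings()
assert not settings.share_location
settings.opt_in("share_location")
assert settings.share_location
```

The design choice is the point: with an opt-out scheme the same class would default every flag to True, and the burden of finding and flipping each one would fall on the user.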
2 - First-person not second- or third-person permission
When you create a new Facebook account, you are asked if you’d like to upload a contact list. If you choose not to, Facebook will still have lots of suggested friends for you. How does Facebook know who you know? One source of data is your friends. If your friends agree to upload their contact lists to Facebook, and your name or email or phone number happens to be on those lists, then by a reverse lookup, Facebook knows who your friends are. Such predictions are highly accurate. By uploading their contact lists, your friends have shared your private data without asking your permission – worse, they have given Facebook permission by proxy to take your private data and profit from it.
A similar process happened with Aleksandr Kogan’s app. The Mechanical Turk workers were paid less than one dollar to complete a survey, and as part of that process, they permitted Kogan to take their Facebook data and, knowingly or not (probably the latter), also their friends’ data. These workers never asked their friends whether they could share their data. But that is beside the point. If Kogan wanted the friends’ data, he needed to seek permission directly from those people. Permission by proxy is dishonest, and should be banned.
This policy imposes a serious limitation on the trading of user data. Either Facebook or the third party must obtain permission from users before the data transfer takes place. Users are demanding a right that has already been afforded to advertisers. Just as advertisers do not want to be associated with certain websites or videos, users do not want their information disclosed to third parties with whom they don’t want to be associated.
3 - Stop misdirection
I’d like to see strong regulation with heavy penalties for businesses that request permission from users for specific uses of their data but then fail to police their data analysts to curb abuses. For example, many websites collect mobile numbers from users, saying that two-factor authentication is essential to protect their accounts. Once the phone numbers are stored in the database, there is no telling which data analysts will get hold of them. Most data analysts will utilize whatever data they can get their hands on.
To prevent misdirection of data, companies should have a data governance function that restricts each datum to the purpose the user approved.
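One way a data governance function could enforce this is purpose binding: store each datum together with the purpose the user consented to, and refuse any read that declares a different purpose. The following Python sketch is hypothetical, not any company’s actual system, but it shows the mechanism using the phone-number example above.

```python
# A sketch of purpose-bound storage: each datum carries the single purpose
# the user consented to, and every read must declare a matching purpose.
class PurposeBoundStore:
    def __init__(self):
        self._data = {}  # key -> (value, allowed_purpose)

    def store(self, key, value, purpose):
        """Save a value along with the purpose it was collected for."""
        self._data[key] = (value, purpose)

    def read(self, key, purpose):
        """Return the value only if the declared purpose matches consent."""
        value, allowed = self._data[key]
        if purpose != allowed:
            raise PermissionError(
                f"{key} was collected for '{allowed}', not '{purpose}'")
        return value


store = PurposeBoundStore()
store.store("phone", "+1-555-0100", purpose="two_factor_auth")
store.read("phone", purpose="two_factor_auth")   # allowed
# store.read("phone", purpose="ad_targeting")    # raises PermissionError
```

Under this scheme, a phone number collected for two-factor authentication simply cannot be read by an analyst building ad audiences; the governance rule lives in the data layer rather than in a policy document nobody enforces.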
4 - Sunshine Policy
It is technically feasible for Facebook or other companies to keep a log of which third parties have received what data about you. If these companies believe that the trading of private data is fundamental to their business models, then they should allow users to inspect how the data were collected, and which entities received them. Better yet, users should be given the ability to opt out of specific transactions. For example, if Facebook has a deal to sell data to Pfizer, users should have the right to say, “No, you may not give my data to Pfizer.”
5 - Wall off the data
If companies are willing to wall off user data, and not send them to third parties, then users are more likely to share the data. Years ago, I gave Amazon a lot of data; I stopped when I learned that Amazon had purchased a company in the business of selling such data to advertisers. The incremental benefit of “better” recommendations is just not worth the potential harm done by the data flowing to unknown third parties.
6 - The right to be forgotten
Europe is ahead of the U.S. on this issue. Companies should be required to delete user data older than, say, five years. Retaining aggregate statistics older than five years should still be allowed. More recent data supersede the older data, so there is negligible value in keeping the old data anyway.
There are two types of data: temporal data, which change over time, and immutable data, which do not. The latter is stuff like your ethnicity, your social security number, and for many people, their permanent home address. An example of temporal data is your favorite book or movie. The Equifax data breach is extremely harmful because they lost a lot of immutable data (of note, other than some noisy hyperventilating when the story broke, the silence of our politicians is deafening). Bad actors only need to get their hands on that data once.
The right to be forgotten reduces the number of copies of your immutable data in existence, and thus reduces the chance that they get stolen. It also removes unreliable and outdated temporal data from the databases, benefitting you in myriad ways, e.g., you will no longer be shown the defunct work email addresses of relatives from decades ago.
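A five-year retention rule can be expressed as a simple batch job. This Python sketch, with hypothetical record shapes, rolls records older than the cutoff into aggregate counts and drops the individual rows, so statistics survive but the data themselves do not.

```python
# A sketch of a retention rule: keep recent records, reduce old records
# to aggregate counts, and delete the old records themselves.
from collections import Counter
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=5 * 365)  # roughly five years


def expire(records, now=None):
    """Split records into (kept, aggregates): recent rows are kept,
    older rows are counted by category and then discarded."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    kept, aggregates = [], Counter()
    for timestamp, category, value in records:
        if timestamp >= cutoff:
            kept.append((timestamp, category, value))
        else:
            aggregates[category] += 1  # the statistic survives, the datum does not
    return kept, aggregates


now = datetime(2018, 3, 28, tzinfo=timezone.utc)
records = [
    (datetime(2017, 1, 1, tzinfo=timezone.utc), "favorite_movie", "Up"),
    (datetime(2010, 6, 1, tzinfo=timezone.utc), "favorite_movie", "Heat"),
]
kept, aggregates = expire(records, now=now)
assert len(kept) == 1
assert aggregates["favorite_movie"] == 1
```

Running such a job on a schedule would make the right to be forgotten a routine operation rather than a per-user favor.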
7 - Stop the blackmail
One reason for the pervasive data sleaze is the favorite business model of web and mobile companies – free service to all, paid for by advertisers. Users are then barred from using the service unless they sign off on extensive snooping. Sometimes, their signatures are not even required; the websites just claim that usage implies consent. This policy is about having the cake and eating it too. The website operators don’t really want to ban any user, so as to inflate their user counts (“eyeballs”).
This practice creates the perception of dishonesty, and is self-defeating if the companies actually believe that the data collection benefits their users. If the business model is such that users get free service in exchange for their private data, then companies should enforce strict access policies, serving only those who acknowledge the data collection.
***
Facebook is the center of attention – rather undeservedly, because all social-media companies engage in similar practices. Regulators are absent from the scene, so there has been no pressure on these companies to tighten up their privacy policies.
Now is the opportunity to find a solution that works for all sides.
Let me know what you think about these principles.
P.S. [3/28/2018] Bloomberg just reported that Facebook announced new privacy policies. A day late, and a dollar short. It’s disappointing that for Facebook it is business as usual. The concept continues to be: let us collect maximal amounts of data, and each user must individually find the settings to delete individual items from the trove of data. People on Twitter are finding that, in some cases, Google has collected over 5 gigabytes of data on a single user over the years – how much time do you have to review each element and ask to delete it? In addition, just as Facebook cannot police Cambridge Analytica on whether or not it deleted the data, Facebook users cannot police Facebook on whether it deleted the data. Twitter users are reporting that they are finding deleted files and other deleted content in their Google data. See here for an example of what people are saying on Twitter.