In a Time magazine piece, Joel Stein covers at length how businesses silently track on-line behavior and link the collected data to individuals. I highly recommend it.
The first page is consistently amusing, as Stein describes the different profiles various companies have compiled about him. For example:
Google's Ads Preferences believes I'm a guy interested in politics, Asian food, perfume, celebrity gossip, animated movies and crime but who doesn't care about "books & literature" or "people & society." (So not true.) Yahoo! has me down as a 36-to-45-year-old male who uses a Mac computer and likes hockey, rap, rock, parenting, recipes, clothes and beauty products; it also thinks I live in New York, even though I moved to Los Angeles more than six years ago. Alliance Data, an enormous data-marketing firm in Texas, knows that I'm a 39-year-old college-educated Jewish male who takes in at least $125,000 a year, makes most of his purchases online and spends an average of only $25 per item. Specifically, it knows that on Jan. 24, 2004, I spent $46 on "low-ticket gifts and merchandise" and that on Oct. 10, 2010, I spent $180 on intimate apparel. It knows about more than 100 purchases in between. Alliance also knows I owe $854,000 on a house built in 1939 that — get this — it thinks has stucco walls. They're mostly wood siding with a little stucco on the bottom! Idiots.
These businesses now admit that they build individual profiles, which means that they are not just analyzing average behavior, not just looking at anonymous user behavior (for example, replacing names with numbers as tracking identities), but they are linking behavioral data to real names, addresses, phone numbers and more.
After some tortuous reasoning over 5 pages, Stein concludes:
the more I learned about data mining, the less concerned I was. Sure, I was surprised that all these companies are actually keeping permanent files on me. But I don't think they will do anything with them that does me any harm.
***
There are a few statements in Stein's article with which I don't agree in full.
- Data mining does no actual harm.
Stanford Professor Ryan Calo believes that privacy concerns are overblown because "data mining does no actual damage", and that "the real problem with data mining arises when the data is wrong". I read this to mean that Calo (and others in this camp) believes that data mining does "actual damage" when the data is wrong. News for them: data can never be accurate; it is always wrong.
- The problem of data errors is solved by letting consumers inspect and edit their own profiles.
Allowing individuals to access and "correct" their data is not a real solution. "Correcting" the data will render it even less informative (more bias), as we can create "personas" to fool the algorithms. Say the profile marks you as a BT (bad tipper). You may indeed be a bad tipper, but nobody wants to be labeled as such, so you would edit the label away even though it is accurate. I discussed this issue in the context of "correcting" credit scores in Chapter 2 of Numbers Rule Your World. (More on BT here.)
- Data mining for ad targeting is a low-stakes activity.
Stein is right that the stakes are low -- the cost of showing poorly-targeted ads is indeed low. However, precisely because false positives are cheap, designers of data-mining algorithms will tune them to tolerate more such errors (in order to reduce false negatives, which cause lost sales opportunities). This means the data is guaranteed to be inaccurate, which undermines Stein's belief that no harm is done. (See Chapter 4 of Numbers Rule Your World for more on the trade-off between types of errors.)
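To make the trade-off concrete, here is a minimal Python sketch with made-up numbers. The scores, the purchase outcomes, the cost values and the function names (`count_errors`, `best_threshold`) are all illustrative assumptions of mine; nothing here comes from Stein's article or from any real ad-targeting system.

```python
from typing import Tuple

# Hypothetical model scores (higher = judged more likely to buy)
# and whether each person actually bought. Invented for illustration.
SCORES = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]
BOUGHT = [False, False, False, True, False, False, True, False, True, True]


def count_errors(threshold: float) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) when we target
    everyone whose score is at or above the threshold."""
    fp = sum(1 for s, b in zip(SCORES, BOUGHT) if s >= threshold and not b)
    fn = sum(1 for s, b in zip(SCORES, BOUGHT) if s < threshold and b)
    return fp, fn


def best_threshold(cost_fp: float, cost_fn: float) -> float:
    """Pick the candidate threshold with the lowest total error cost.
    Ties go to the lowest-scoring candidate, since min() keeps the first."""
    def total_cost(t: float) -> float:
        fp, fn = count_errors(t)
        return cost_fp * fp + cost_fn * fn

    return min(SCORES, key=total_cost)


if __name__ == "__main__":
    # Compare equal error costs with a case where a missed sale
    # is assumed to cost ten times as much as a wasted ad.
    for cost_fp, cost_fn in [(1, 1), (1, 10)]:
        t = best_threshold(cost_fp, cost_fn)
        fp, fn = count_errors(t)
        print(f"cost_fp={cost_fp}, cost_fn={cost_fn}: "
              f"threshold={t}, false positives={fp}, false negatives={fn}")
```

With these invented numbers, equal costs lead to a threshold of 0.60 with one false positive and one false negative. Once a false negative is assumed to cost ten times as much as a false positive, the chosen threshold drops to 0.30: no sales are missed, but three people are now wrongly targeted. Cheap false positives buy more false positives.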
- Companies are not doing anything harmful.
"Are not" does not imply "will not" unless we have "cannot".
- Asking consumers for prior consent solves the problem of secretive tracking.
This type of solution is both deceitful and impractical. Anyone who has been asked to agree to the 50-page Terms and Conditions for the iPhone (or similar) understands that one is not supposed to read the agreement, just to categorically accept it. Besides, it is difficult, if not impossible, to document exactly how each employee of a business is using the data. This gets even more complex if the business sells the data to third parties, and/or exchanges the data with other data collectors.
***
The CEO of Bizo, one of the companies that build profiles, claims that "not many people seem to be creeped out by all the junk mail they still get from direct-marketing campaigns, which buy the same information from data-mining companies". He clearly does not understand the other side of the argument.
The differences between traditional "junk mail" and today's "targeted ads" are speed and scale. Direct mail campaigns typically take at least 2 weeks to plan and execute (and that is extremely fast), while an on-line targeted ad can appear within seconds of someone visiting a web page. Needless to say, the creepiness factor is many times higher. Stein has a nice anecdote about Zappos's experience.
The other key difference is scale. The extent to which data is being collected, exchanged, aggregated and analyzed today is also many times that of yesterday.
***
If consumers indeed do not care, as Stein seems to believe, then tracking should always be opt-in. Today, it is with few exceptions opt-out, and in many cases it is undisclosed.
As I mentioned when launching the "Know your data" series of posts, I think these types of technology have both benefits and problems, and I don't oppose all such efforts. After all, I am a part of the direct marketing industry. Nevertheless, I think some of the current data-collection activities are too intrusive in addition to being misguided. I'll explain what this means in a future post.


joel *stein*
Posted by: | 03/18/2011 at 03:26 AM
Thanks anonymous reader. Hope I fixed them all.
Posted by: Kaiser | 03/20/2011 at 11:13 PM