On Friday, Facebook announced that hackers had gained access to the personal data of at least 50 million users (see here for example). Analysts immediately connected this incident to the Cambridge Analytica scandal. How is this data breach different from the Cambridge Analytica scandal?
How the data was breached
One should take any announcement of how a data breach occurred with a grain of salt: it's obvious that companies will not publish anything that attracts lawsuits; plus, it's unclear whether there is any penalty for lying about the real reasons for a data breach. There is no external auditing, and the companies control the narratives and statistics surrounding these events.
Based on what Facebook told us, the Cambridge Analytica scandal involved Facebook partners gaming the system to obtain data about Facebook users. At worst, it was a violation of community standards, i.e. claiming to be doing academic research when the data was put to other uses. The research firm used tools provided by Facebook to obtain the data. (The firm maintains that it disclosed to users that it was using the data for non-research purposes.)
In the present case, Facebook claims that a combination of coding bugs (i.e. unintended features) enabled unknown parties to gain access to users' entire accounts. Not only that, but also to any accounts on third-party apps or websites that the user signs on to using Facebook.
How the breach was discovered
The current scandal demonstrates the value of the business intelligence/business analytics functions within companies. We were told that Facebook first realized that certain metrics were showing unusual trends, and upon investigation, they discovered the bugs.
This is entirely believable. That's what happens when you have good data reports. They surface anomalies. These then have to be investigated. These investigations are extremely tricky because all you know is that the trends are different. There are a thousand possible reasons for the shift. The analyst's job is to establish cause and effect. Especially since the development community adopted "agile" practices, all kinds of self-imposed changes are occurring all the time with no warning. For a site as large and complex as Facebook, it takes a huge effort just to get a list of all site changes within some specified time window!
What complicates this situation further is that the vulnerability was traced to multiple bugs, not just one. I can imagine the twists and turns and the false alarms that were generated during the investigation.
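To make the idea of a report "surfacing an anomaly" concrete, here is a minimal sketch of one common approach: flag any day whose metric sits far from the average of the preceding week. This is only an illustration, not how Facebook's reports work. The function name flag_anomalies, the seven-day window, the threshold of three standard deviations, and the daily_metric numbers are all made up for the example.

```python
from statistics import mean, stdev

def flag_anomalies(values, window=7, threshold=3.0):
    """Return the indices of values that deviate from the mean of the
    preceding `window` values by more than `threshold` standard deviations.
    (Hypothetical helper for illustration only.)"""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]   # the trailing window, excluding today
        mu = mean(history)
        sigma = stdev(history)           # sample standard deviation
        if sigma == 0:
            # A perfectly flat history: any change at all is unusual.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Made-up daily counts of some site metric; day 8 shows a sudden spike.
daily_metric = [100, 102, 98, 101, 99, 103, 100, 101, 180, 102]
print(flag_anomalies(daily_metric))  # prints [8]
```

Notice what the output does and does not say: it tells you that day 8 looks different, and nothing about why. Finding the cause is where the analyst's work begins.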
The nature of this problem is no different from an investigator trying to chase down an E. coli outbreak, which is detailed in Chapter 2 of Numbers Rule Your World (link).
The data science community is guilty of looking down on the business intelligence function. There is a misperception that BI is for less skilled people doing boring things. The reality is there is more science in BI than in so-called data science (defined here as software engineering). Science, after all, is about figuring out why things are as they are. Engineers, by contrast, use our understanding of science to change the way things are.
You've been warned
In my debut YouTube video, released a few weeks ago, I explain how Facebook collects data about you. My biggest tip for protecting yourself is to not use Facebook to log on to other websites. Convenience is the drug that lures you into the trap.
Think about those one-password services. Instead of hundreds of passwords, you have one. So the thief needs only one password to access hundreds of sites. Using Facebook to log on to other websites makes Facebook the centralized point of attack for the bad players.
Note that this is not limited to Facebook. Using Google, Yahoo, Amazon, etc. to log on to other services is no different!
If it's more convenient for you, it's also more convenient for your enemies. Remember that.
Also in the video, you will learn:
- why Facebook invented the Like button
- how Facebook knows what you are doing outside Facebook
- why not clicking doesn't mean you're not being tracked
- why you shouldn't use Facebook to log into other services
See the video here, and send me ideas for future episodes!