Here is a problem staring many digital/Web/social media analysts in the face today: what if you are told that the majority of the data you have been dutifully reporting, analyzing and (gasp!) modeling are fake data?
By fake data, I mean useless numbers that have no bearing on reality: visits to websites that never happened, clicks on ads by hired hands, clicks on ads by bots, clicks on ads that are buried layers deep and invisible to any human, video "views" that result from automatically playing clips, video "views" that last one second, ad reach (i.e. the number of people who have seen the ad) that exceeds Census counts, reviews planted by hired hands, and so on.
None of the above is fictional. This is the reality of the uncontrolled, unaudited, increasingly machine-driven, complex and secretive world of digital advertising. All major players - Google, Facebook, Microsoft, ad networks like AppNexus and Mediamath - are implicated.
I raised the alarm two years ago in an article at Harvard Business Review, featuring the work of leading ad fraud researcher Dr. Augustine Fou. Recently, there has been a tidal wave of news reports about all kinds of ad fraud and fake data.
Here are a few selected links to get you started:
Ad Buyers Rate Facebook’s 10 Measurement Errors
Facebook Ads Supposedly Reach More People Than Census Counts
Google and Other Ad Platforms Sold Fake Ads
Google’s Chrome and Microsoft’s Browsers Attract Fraudsters
P&G Cut $100 Million of Digital Ads, Without Impact on Results
Business Could Lose As Much As $16.4 Billion in Online Advertising in 2017
I have invited Dr. Fou to comment on this fast-developing situation in the Principal Analytics Prep Webinar on Wednesday night. Learn more about the Webinar and register for free here.
***
Most news items take the perspective of brand advertisers, who are belatedly waking up to the huge amount of dollars wasted. But a big story is being missed: such waste was enabled by massive amounts of data that we now know are fake.
What about the zillions of reports, analyses and models created over the last 20 years by countless data "scientists" and analysts, in which the data from Google, Facebook, and myriad digital marketing vendors are taken at face value as accurate?
In fact, the digital advertising industry was built on the promise that it is more measurable, more accountable and more cost-effective. What Dr. Fou shows is that only basic statistics are needed to uncover such fraud.
Data cleaning is already a huge time sink without fake data. Now we also have to wrestle with mountains of fake data. But that is the reality, and we have to rise to the challenge.
It might be cynical but true that Google, Facebook, etc. knew about this and did nothing, because it didn't hurt their bottom line.
Posted by: Ken | 10/12/2017 at 09:36 AM
Cheers for the read! In my job, I've noticed some telltale signs of fake data (a heap of traffic lasting for a second, with a high bounce rate, coming from a specific hosting service or location at a given time). Is there anything more I should be looking out for?
Posted by: Jason Borg | 10/16/2017 at 03:29 AM
JB: I should invite Augustine over to talk about this since he's the expert. But some of the things he mentioned are websites or apps that have machine-generated names, ads that have 100% click-through rates, and segments that have almost the same proportions (indicating that some random number generation is deployed!). There is also "ghost traffic" but that is a different matter.
Posted by: Kaiser | 10/22/2017 at 12:19 AM
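Several of the warning signs mentioned in the comments above can be checked with very basic statistics. Here is a minimal Python sketch of three such checks: implausibly high click-through rates, segments split in almost identical proportions, and sources dominated by one-second bounces. The function names, the input layouts and every threshold (`min_impressions`, `max_ctr`, `tolerance`, `min_share` and so on) are my own illustrative assumptions, not Dr. Fou's method or any vendor's API. A flag here is a reason to look closer, not proof of fraud, and the sketch does not attempt to detect machine-generated names.

```python
from collections import defaultdict


def suspicious_ctr(placements, min_impressions=1000, max_ctr=0.5):
    """Return names of placements whose click-through rate is implausibly high.

    placements: iterable of (name, impressions, clicks) tuples (assumed layout).
    """
    flagged = []
    for name, impressions, clicks in placements:
        # Ignore tiny placements, where a high rate can happen by chance.
        if impressions >= min_impressions and clicks / impressions >= max_ctr:
            flagged.append(name)
    return flagged


def near_uniform_segments(counts, tolerance=0.01):
    """Return True if every segment's share is within `tolerance` of an equal split.

    counts: dict mapping segment label -> count (assumed layout).
    """
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return False
    equal_share = 1 / len(counts)
    return all(abs(c / total - equal_share) <= tolerance for c in counts.values())


def one_second_sources(visits, max_seconds=1, min_visits=100, min_share=0.9):
    """Return sources where almost all visits are very short bounces.

    visits: iterable of (source, seconds_on_site, bounced) tuples (assumed layout).
    """
    totals = defaultdict(int)
    short = defaultdict(int)
    for source, seconds, bounced in visits:
        totals[source] += 1
        if bounced and seconds <= max_seconds:
            short[source] += 1
    return sorted(
        s for s, n in totals.items()
        if n >= min_visits and short[s] / n >= min_share
    )


# Made-up numbers, purely for illustration.
placements = [("news-site", 50000, 60), ("xkqzvbw-app", 2000, 2000)]
segments = {"18-24": 2501, "25-34": 2499, "35-44": 2500, "45+": 2500}
print(suspicious_ctr(placements))       # placements worth a closer look
print(near_uniform_segments(segments))  # True means suspiciously even split
```

Real traffic can occasionally trip any of these checks, so the thresholds should be tuned against data you trust before acting on the flags.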