Effective analytics require a very simple but often overlooked ingredient: accurate data.
Digital media provides a much richer store of data than its print counterpart, but rapid innovations in distribution and technology mean that even the most advanced tracking systems can fail to keep up with significant changes.
Last year, for example, the accuracy of Facebook’s referral data was called into question: Facebook often appeared to be meaningfully underreported as a source of referral traffic. Because Facebook remains one of the largest sources of traffic to our clients’ websites, we wanted to know: Has this been fixed?
First, let’s briefly touch on the phenomenon of “dark traffic” or “dark social.”
Dark traffic encompasses any traffic lacking referral information that tells an analytics provider where the person came from. Parse.ly and other platforms often refer to this as “Direct” traffic. This was traditionally attributed to people typing website names into their browsers. As editorial analytics became more sophisticated, many in the industry noticed that these readers often went straight to articles, not to homepages as had been assumed.
Gradually, it became clear that this traffic came from three main types of sources:
Email and 1-1 Communication
This includes direct emails among friends and, more frequently, email newsletters that have increased in popularity, especially among media outlets. 1-1 communication includes chat applications like Gchat, WhatsApp, and Kik. The Guardian reported on findings that this could account for up to 30% of all sharing.
Other sources that strip referrals on purpose
Moving from an https page to an http link strips the referrer for security. Features like “Incognito” browsing also accomplish this stripping, along with specific, security-centric services like the search engine DuckDuckGo.
Apps that strip referrals by accident
Alexis Madrigal has argued that most dark traffic was in fact coming from Facebook’s app. The main reason: a bug in Facebook’s mobile app stripped out Facebook as a referrer for months. Other apps likely do this by accident as well, but since Facebook had grown so much as a source of traffic for publishers over the previous year, this understandably caused a lot of concern.
This problem, though interesting, wasn’t insurmountable. Because the stripping was accidental rather than intentional, there were other ways to approximate the missing data and correct the attribution.
The teams at The Guardian and BuzzFeed, for example, both commented to say that they tracked this information through User Agents.
@alexismadrigal @mathewi the Guardian’s Ophan has been tracking that IA for over a year (hence the lower dark social number) /cc @tcordrey
— Joost de Valk (@jdevalk) December 4, 2014
For the less technically minded, this meant that Facebook was sending additional information with its dark referrals, in the user agent string, which most analytics systems (including our own) typically didn’t use in attribution.
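To make the user agent approach concrete, here is a minimal sketch of how a page view with no referrer might be reattributed. It is not Parse.ly’s, The Guardian’s, or BuzzFeed’s actual implementation. The function name `attribute_source`, the source labels, and the token list (`FBAN/`, `FBAV/`, `FB_IAB/`, markers commonly seen in the user agent of Facebook’s in-app browser) are assumptions for illustration and should be checked against your own logs.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: tokens commonly seen in the user agent string of Facebook's
# in-app browser. Verify this list against real traffic before relying on it.
FACEBOOK_UA_PATTERN = re.compile(r"FBAN/|FBAV/|FB_IAB/")


def attribute_source(referrer: Optional[str], user_agent: str) -> str:
    """Return a coarse traffic-source label for a single page view.

    Hypothetical helper: the labels and rules are illustrative only.
    """
    if referrer:
        # A referrer is present, so attribute by its hostname as usual.
        host = urlparse(referrer).hostname or ""
        if host == "facebook.com" or host.endswith(".facebook.com"):
            return "facebook"
        return "other referral"
    if FACEBOOK_UA_PATTERN.search(user_agent):
        # No referrer, but the user agent identifies Facebook's in-app browser.
        return "facebook (via user agent)"
    # No referrer and no identifying user agent: this stays "direct" (dark) traffic.
    return "direct"
```

Counting the views labeled `facebook (via user agent)` against all views with no referrer gives the kind of estimate discussed here: the share of “direct” traffic that actually came from Facebook.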
So last year we took a look at how much traffic was being hidden due to this issue, and found that 11% of page views that we had been categorizing as “direct” were actually from Facebook.
When we did this, we also spoke to Facebook about the phenomenon and they promised to correct it.
Did they follow through? Almost certainly.
In our latest Authority Report, we saw Facebook rise, once again.
This follows their trend of continuous growth, but it also appears likely that some of the growth came from corrected referral data. After the mid-November iOS app update, visits identified by user agent dropped from around 20% of direct traffic to less than 5%.
It’s clear, for now, that Facebook data is much more accurate in Parse.ly, and likely in other analytics platforms as well. Dark social, once again, can be defined as mostly email and 1-1 communications.