The answer: it’s all about perspective. I recently received an email from a reader that went like this:[quotebox]For 8 years we’ve been using Urchin 6, provided by our web hosting company. Now that we’ve switched hosts, we started using Google Analytics, and instead of reporting 50,000 total “sessions” a month, we now see 10,000 total “visits” a month. How is it possible for the numbers to be so different, when both companies are owned by Google? Which numbers are correct?[/quotebox]
This is a great question, and not an uncommon one. I want to dig into why differences commonly arise between web analytics tools, even ones from the same company with a shared past (Urchin and Google Analytics being a case in point).
Know Your Tools
In this particular case, the products being compared (Urchin vs. Google Analytics) are likely not very similar, despite both being owned by Google. Given that the asker noted using “Urchin” for 8 years and that it was provided by their hosting company, my guess is that they were using Urchin 5, or even 4, since Urchin 6 was only released in April of 2008. The hosting company may have upgraded along the way, but even so, that is a far cry from saying the tools are the same.
Know What You’re Comparing
There are two key ways to get web analytics data:
- From web server log files
- From page tags (tracking code that runs in the visitor’s browser)
I like to think of the difference between these two data sources as looking at the top side or the underside of the same rug. The underside shows a complex mishmash of strings, while the top side shows a beautiful pattern. Web server log files are literally the server’s perspective on what happened, while the tag-based method is literally the user’s perspective on what they did while on the site. This is the most critical aspect to consider. If you want to answer questions about what the servers were doing, look at server logs. If you want to answer questions about what people were doing, look at tag-based data. Brian Clifton goes into more detail on this topic in his book.
Understanding Data Sources
The server log contains ALL hits, whether from humans or non-humans. In my experience, about 60% of the data in most logs comes from non-humans, i.e. search engine robots, content-scraping bots, etc. An improperly configured Urchin profile will report all of these hits the same way and will identify sessions based on simple IP address + User-Agent combinations of hits within a 30-minute time window. Thus, you usually get a much larger number of “sessions” reported than what is actually happening.
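To make the IP address + User-Agent idea concrete, here is a minimal Python sketch of that kind of session counting. It is an illustration of the general approach only, not Urchin's actual algorithm; the `count_sessions` function, the tuple layout, and the sample hits are all assumptions made up for this example.

```python
from datetime import datetime, timedelta

# Assumed inactivity window; the 30-minute figure comes from the text above.
SESSION_TIMEOUT = timedelta(minutes=30)

def count_sessions(hits):
    """Count sessions from (ip, user_agent, timestamp) tuples.

    A new session starts whenever an IP + User-Agent pair appears for the
    first time, or reappears after more than SESSION_TIMEOUT of inactivity.
    """
    last_seen = {}  # (ip, user_agent) -> timestamp of the most recent hit
    sessions = 0
    for ip, user_agent, ts in sorted(hits, key=lambda hit: hit[2]):
        key = (ip, user_agent)
        previous = last_seen.get(key)
        if previous is None or ts - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[key] = ts
    return sessions

# Hypothetical sample data: one human visitor and one crawler.
hits = [
    ("203.0.113.5", "Mozilla/5.0", datetime(2010, 1, 1, 9, 0)),
    ("203.0.113.5", "Mozilla/5.0", datetime(2010, 1, 1, 9, 10)),
    ("203.0.113.5", "Mozilla/5.0", datetime(2010, 1, 1, 10, 0)),
    ("198.51.100.7", "Googlebot/2.1", datetime(2010, 1, 1, 9, 5)),
]
print(count_sessions(hits))  # prints 3
```

Note that the crawler's single hit counts as a full session here, exactly like a human visit. Nothing in this method distinguishes the two unless bots are filtered out first.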
Truth is Rather Gray
It’s not that IP + User-Agent data is “wrong” per se, but it must be interpreted in its context. If it were up to me, it would be a crime for hosting companies not to make this clearly known, because you’ve been thinking you’re reporting “people who visit the site” when in reality you’re vastly over-reporting that number, since the bots are probably included.
All this to say, a disparity is common.
If you have questions about the quality of your data, I would recommend that you conduct a full audit of it. If your GA/UTM tags aren’t placed on all parts of your site, for example, the data will be falsely low because it is incomplete. To really unravel this knot and explain it to executives, you’ll need to be able to show why the numbers have changed, explain what goes into the numbers from each system, and help guide the transition to better data.
If you have your old server log files around and your Urchin profile allows filtering and re-processing, then we can take some measures to filter out bots, but the numbers still won’t line up well. Think of it like measuring the height of your desk in centimeters vs. inches: same desk, different scale.
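As a rough idea of what filtering out bots can look like, here is a small Python sketch that drops hits whose User-Agent string looks like a crawler. This is not how Urchin's filters are configured; the `BOT_MARKERS` list and both function names are hypothetical, and a simple substring match like this will miss bots that disguise themselves as browsers.

```python
# Illustrative substrings only; real bot lists are far longer.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

def is_probable_bot(user_agent):
    """Return True if the User-Agent contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def drop_bot_hits(hits):
    """Keep only (ip, user_agent) hits that do not look like bots."""
    return [hit for hit in hits if not is_probable_bot(hit[1])]

# Hypothetical sample data.
hits = [
    ("203.0.113.5", "Mozilla/5.0 (Windows NT 6.1)"),
    ("198.51.100.7", "Googlebot/2.1 (+http://www.google.com/bot.html)"),
    ("198.51.100.9", "Yahoo! Slurp"),
]
human_hits = drop_bot_hits(hits)
print(len(hits), len(human_hits))  # prints 3 1
```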
A few tools that you can use to help audit your data:
- ObservePoint’s tag auditing tool
- Analytics HealthCheck for checking the integrity of your Google Analytics data
- Get a free demo of the Urchin software and run your server logs through it
- Get an expert to help – there are over 150 certified companies for Google Analytics worldwide, including yours truly
I hope this helps and look forward to your comments and questions on this topic!