Google Analytics - removing referral spam

Google Analytics is a great tool. Powerful. Free for small businesses. It's a million times better than AWStats or other similar free reporting tools that come with website hosting.

But it's not perfect. There are some serious issues that have escalated recently to the point where you just can't take the statistics at face value.

Update October 2015

From sometime in mid-September, it looked like Google had made a change that filtered out a lot of these sites, as the number of spam referrals dropped significantly. I was, for a short time, heartened - but new referral spam sites keep popping up. So it's still something you have to keep an eye on.

For our managed clients, we prepare a monthly report that provides various bits of information about how their website is doing and how to improve it. Some of the information comes from Google Analytics.

But we clean up the numbers before we report on them.

Junk in your data

The default view when looking at Google Analytics is 'All Sessions' - which, as the title suggests, includes all visitors to your website.

At this time there are two main issues if you are looking at your data using 'All Sessions':

1. Referral spam. These are other sites sending you tons of automated visits which are not real people and certainly aren't interested in buying from you. In some cases, they don't even land on your site but send fake requests direct to Google Analytics. We explained referral spam in more detail when it first appeared a few years ago.

2. Direct visitor spam. Also known as ghost visits, these are a more recent issue. They can be the result of several activities not related to real visitors, including spiders crawling your site (to index or analyse it for their own purposes - mostly not above board). Some direct visits are the result of your analytics tracking ID being hijacked.

We've also seen less common issues arising from:

  • AdWords traffic sent by agencies being lumped together, without campaign data like top keywords, top ads etc.
  • Event tracking spam.
  • Untagged campaigns - the most common being email newsletters - so visits come in as 'direct' when the person has actually clicked on a link.
  • Long visitor session times can cause time-outs, with Analytics registering a new visit. These usually come up as a self-referral.

This doesn't count problems with the way the Analytics tracking code has been implemented, which can cause all sorts of different issues.
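On the untagged campaigns point above, the usual fix is to add campaign parameters to every link in your newsletter, so Analytics can attribute the click instead of recording it as 'direct'. Here is a minimal sketch of how that tagging could be automated. The function name, the example URL and the campaign values are our own placeholders for illustration, and it assumes the standard utm_source, utm_medium and utm_campaign parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def tag_link(url, source, medium, campaign):
    """Return url with utm_source, utm_medium and utm_campaign added."""
    tags = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    }
    parts = urlparse(url)
    # Keep the link's existing query parameters, but drop any old
    # campaign tags so they are not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in tags]
    query.extend(tags.items())
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical newsletter link and campaign values.
print(tag_link("https://www.example.com/offers?ref=1",
               "newsletter", "email", "october-update"))
```

With links tagged like this, a click from the newsletter should show up against the email medium rather than being lumped in with direct visits.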

How we clean up the numbers

At this point we are using custom segments that exclude spam and fake direct visits for reporting purposes. And you can use filters to stop them appearing in Analytics in the first place.

You can also:

  • block certain URLs using your htaccess file - this stops them getting to your site. We only do this for the worst offenders, since they keep changing their originating URLs and you have to keep updating your htaccess file. The file can get bloated and unwieldy, and continuously adding new offenders can potentially affect the performance of your site. This won't get rid of the pesky direct spoof visits, because they never actually land on your site.

  • add filters - this also works, but filters only apply from the date you add them going forward. That is fine for future data, but if something dodgy appeared in the last month, the filter won't weed it out.

We don't cover how to apply these methods in detail here as it would make for a very long post - but you'll find lots of helpful guides online suitable for your set-up and preferences.
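That said, to give a rough idea of what both methods boil down to, here is a minimal sketch that turns a list of offending domains into htaccess rules and into a regular expression an exclude filter could use as its pattern. It is not a full how-to. The domain names are made up, the function names are ours, and the htaccess output assumes an Apache server with mod_rewrite enabled.

```python
def escape_domain(domain):
    """Escape the dots so each one matches a literal '.' in a regex."""
    return domain.replace(".", r"\.")


def htaccess_rules(domains):
    """Build mod_rewrite lines that answer 403 Forbidden to these referrers."""
    if not domains:
        # With no conditions the rule below would block every visitor.
        return ""
    lines = ["RewriteEngine On"]
    for i, domain in enumerate(domains):
        # Every condition except the last is chained with OR.
        flags = "[NC]" if i == len(domains) - 1 else "[NC,OR]"
        lines.append(
            "RewriteCond %{HTTP_REFERER} " + escape_domain(domain) + " " + flags
        )
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


def filter_pattern(domains):
    """Build one regex that matches any of the given referrer domains."""
    return "|".join(escape_domain(d) for d in domains)


# Hypothetical offenders, for illustration only.
offenders = ["spam-referrer.example", "junk-traffic.example"]
print(htaccess_rules(offenders))
print(filter_pattern(offenders))
```

Keeping the offenders in one list means the htaccess file and the filter are updated from the same place, which helps a little with the upkeep described above.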

Here is Google's guide to segments and our own Introduction to Google Analytics Segments.

Why bother?

Here is an example of just how badly this junk can skew your data. This shows the raw data (all sessions) and the filtered data (all sessions minus the junk):

[Screenshot: analytics screenshot]

You should be regularly looking at how your website is performing.

At the end of the day, sales are really all that counts for most businesses, but your statistics point to the things you can do to increase revenue via your website. Whether you want to increase leads (visitor numbers), improve the quality of leads or improve your conversion rates, your statistics will show you the weak spots.

Including this junk will mean you will be working with incorrect information – not just in terms of visitor numbers but location, engagement rates and interaction patterns.

Which means you will make the wrong decisions.

So, if you do look at your analytics on a regular basis, start looking a bit closer and filter out the junk.

A temporary solution

You can easily filter out both direct and referral visits because these are where most of the spam and/or automated visitors are going to get recorded.

You will also filter out the legitimate ones, but it may be worth it if you don't get many quality leads from these two sources.

1. Log in to Analytics

2. Click on Add Segment

[Screenshot: add segment]

From the resulting list of possible segments already available, you could scroll down and select 'Excluding Direct' and 'Exclude Referrals', then click on Apply.

But this would give you this:

[Screenshot: two segments]

Which isn't quite what you want - although you can see your visitor numbers without the direct and referral visits, it doesn't give you all the rest in a nice easy to analyse lump.

So, a better way is to:

    1. Click on Add Segment
    2. Click on the ' + New Segment' button
    3. Give your segment a name – e.g. “Exclude Direct AND referral visits”
    4. From the left hand menu, click on 'Conditions'
    5. Here you will add filters that will weed out any referral or direct visits from your data
    6. First, change the condition from 'include' to 'exclude' by selecting exclude from the drop down menu
    7. Click on the drop down box that starts with Ad content, and start typing 'source' until 'Source' shows up on the list.
    8. Select it
    9. In the input field next to 'contains' type in Direct – once you start typing it will show in a blue box and you can just select it
    10. Click on the OR button to the right of the input field
    11. Start typing 'medium' till it shows up in the list and select it
    12. Type in referral in the empty field

This should give you something that looks like this:

[Screenshot: segment conditions]

    13. To the right of the conditions area, Analytics will calculate and display how much of your data will remain (so you can get a sense of the impact of applying the segment)
    14. Save your segment

Analytics will automatically display your new segment.

To make it more useful, you can drag and drop the segment names. Drag 'All Sessions' to the right of your new segment, so that your new segment is displayed first and you get this:

[Screenshot: result]

So, what you have is your visitor statistics without any that came from referral sources, or that came direct. In the example above, you can see that the bulk of visitors came via other sources (e.g. Google search), so this approach, while not perfect, will be a little more focused.
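If it helps to see the logic of the segment written out, here is a minimal sketch of the same two conditions applied to a handful of made-up sessions. The session records and the function name are ours, purely for illustration. It assumes that direct visits are reported with a source of '(direct)' and that the 'contains' match ignores upper and lower case.

```python
def in_segment(session):
    """Mirror the segment: exclude source contains 'direct' OR medium contains 'referral'."""
    source = session["source"].lower()
    medium = session["medium"].lower()
    excluded = "direct" in source or "referral" in medium
    return not excluded


# Made-up sessions, one per kind of traffic.
sessions = [
    {"source": "google", "medium": "organic"},
    {"source": "(direct)", "medium": "(none)"},
    {"source": "spam-referrer.example", "medium": "referral"},
    {"source": "newsletter", "medium": "email"},
]

kept = [s for s in sessions if in_segment(s)]
print(len(kept), "of", len(sessions), "sessions remain")
```

Note that a genuine referral from a partner site would be dropped along with the spam, which is the trade-off this temporary solution accepts.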

We provide monthly analytics reports as a standalone service for anyone, anywhere, who has Google Analytics installed but doesn't have the time or inclination to figure out what it all means. Get in touch if you are interested.

Essentee
