Capturing mobile operating systems in Google Analytics

One of the great features available in Google Analytics is the ability to view a wealth of information about users who browse your site on a mobile device, be it a smartphone, tablet or even an older WAP-enabled device. This suite of reports is invaluable for anyone building a mobile-specific site, and incredibly useful for those who just want to optimise their main website for their mobile traffic.

The mobile reporting suites in both the older version of Google Analytics and the new Version 5 interface provide a breakdown of the devices used to access a site and the operating system running on those devices. Version 5 also allows you to drill down and see the specific version of the operating system running on the mobile device. Cool! It's always useful to be able to examine whether a specific variant of an OS is having issues displaying your site or letting users interact as intended. The more you know, the easier it is to resolve problems and keep things running smoothly.

Mobile operating systems report
Drilling down on a listed operating system should show a breakdown of versions

Unfortunately, despite this functionality being available within GA v5, more often than not the actual OS version is missing, and the report shows the rather depressing "(not set)" value instead. I can only assume that the team at Google are still working on the full implementation for this report. However, all is not lost: through a combination of some JavaScript magic and Google Analytics' very powerful profile filters, we can capture this data ourselves and drop it into the operating system versions report.

So where do we start? Well, whenever a browser makes a request to a web page there is a piece of data known as the user agent string which gets sent as part of the request. This string contains information on the platform and operating environment we are browsing on. For example, it may tell the website that I am viewing a page in Firefox v9, on a Windows 7 machine. Similarly in the case of mobile traffic, it could say that I am browsing on an Apple iPad running iOS version 5.0. This user agent string is where we will be extracting our operating system data from.

With this information readily available to us, our first step is to create a piece of JavaScript to dissect the user agent string and determine both the device type and the operating system (and its version, naturally). The script works through the following steps:

    • Determine the device type, for example an iPhone, iPad, BlackBerry or Android device. Knowing the device type tells us what operating system is actually running - for example we know that an Apple iPhone would be running Apple iOS in some form, or that an HTC Desire would be running Android.

    • Once we have the device type, we use a regular expression to extract the operating system version. This allows us to capture variable-length version numbers (for example, we can't just assume that a device will be version 2.1, when a 2.1-update or 2.1.3 version may also be released), and also allows us to accommodate changes in the format of the user agent string between older and newer devices. BlackBerry smartphones are a good example of where the version information is provided in differing syntaxes between older and newer models.

    • Once we have our operating system identified and its version number, we combine them into a single string and send a _setVar command to Google Analytics containing this data. This drops the operating system and version into the __utmv cookie, and the data is automatically sent as a request to Google Analytics.
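The three steps above can be sketched roughly as follows. This is a minimal illustration rather than the actual script used on our site: the helper name parseMobileOs, the platform labels ('iOS', 'Android', 'BlackBerry') and the exact regular expressions are all assumptions made for the example, and real user agent strings vary more than this handles.

```javascript
// Hypothetical sketch of the detect / extract / send steps.
// The labels and patterns below are illustrative assumptions.
var SEPARATOR = '::'; // splits the OS name from its version

// Returns a string such as "iOS::5.0", or null if nothing is recognised.
function parseMobileOs(ua) {
  var match;

  // Step 1 and 2 for Apple devices, e.g. "CPU OS 5_0 like Mac OS X".
  if (/iPad|iPhone|iPod/.test(ua)) {
    match = /OS (\d+(?:_\d+)*)/.exec(ua);
    return match ? 'iOS' + SEPARATOR + match[1].replace(/_/g, '.') : null;
  }

  // Android, allowing variable-length versions such as "2.1-update1".
  if (/Android/.test(ua)) {
    match = /Android (\d+(?:\.\d+)*(?:-[A-Za-z0-9]+)?)/.exec(ua);
    return match ? 'Android' + SEPARATOR + match[1] : null;
  }

  // BlackBerry: newer models report "Version/6.0.0.337", while older
  // ones use the form "BlackBerry9000/4.6.0.167".
  if (/BlackBerry/.test(ua)) {
    match = /Version\/(\d+(?:\.\d+)*)/.exec(ua) ||
            /BlackBerry\d+[A-Za-z]*\/(\d+(?:\.\d+)*)/.exec(ua);
    return match ? 'BlackBerry' + SEPARATOR + match[1] : null;
  }

  return null; // not a device this sketch knows about
}

// Step 3: send the combined value to Google Analytics via _setVar.
// Assumes the asynchronous _gaq queue already exists on the page.
function checkMobileAgent() {
  var value = parseMobileOs(navigator.userAgent);
  if (value) {
    _gaq.push(['_setVar', value]);
  }
}
```

Keeping the parsing in its own function, separate from the _setVar call, makes it easy to try the patterns against sample user agent strings before deploying anything.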

    To avoid bloating this post with a ridiculous amount of JavaScript, you can view and save our copy of the code at the following link (this version tracks Apple, Android and BlackBerry devices): http://www.freshegg.com/js/ga-mobile.js

    Now I should point out that _setVar is a deprecated method, with Google recommending the use of custom variables instead of the single-slot method we've used above. Unfortunately though, it isn't yet possible to do any profile-level filtering on custom variables. The data stored using the _setVar method, by contrast, can be manipulated by profile filters prior to reporting, and that's something we require later on.

    Once we've built our JavaScript we need to make sure it is loaded before the Google Analytics code on our web pages, whether loaded in from a separate .js file or simply inline (I'd recommend the former approach though as it is a bulky script). Next we need to make a tiny addition to the tracking code on our site. We will be executing our identify and capture script as a function call, which we push to the Google Analytics processing queue, as follows:

     _gaq.push(function() { checkMobileAgent(); });

    This code can be placed either before or after your _trackPageview call, but make sure that you send it in the same block of code.
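As a rough illustration, the queue set-up in a standard asynchronous tracking snippet might end up looking like the sketch below. The account ID 'UA-XXXXX-X' is a placeholder, and the part of the snippet that loads ga.js itself is left out for brevity.

```javascript
// Sketch of the command queue with the extra function call added.
// 'UA-XXXXX-X' is a placeholder; use your own web property ID.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);

// Queue our detection function; it runs once ga.js processes the queue.
_gaq.push(function() { checkMobileAgent(); });

_gaq.push(['_trackPageview']);
```

Because the function is only queued here, checkMobileAgent is not called until Google Analytics works through the queue, which is why the script defining it has to be loaded first.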

    With our code built and tracking modifications in place, the next step is to set up a trio of filters on any profiles on our target web property that we want to apply this enhancement to.

    The first and second filters extract our OS and version details from the __utmv data, which is stored in the "User-Defined" filter field, and drop the information into the Visitor Operating System Platform and Visitor Operating System Version fields, respectively. Both filters are advanced custom filters, and are set up as follows:

    Filter 1 - Capture Mobile OS Platform from User Defined

    Filter Type: Custom Filter, Advanced

    Field A -> Extract A: User-Defined

    Field A -> Extract A value: ^([^:]+)::.+$

    Output To -> Constructor: Visitor Operating System Platform

    Output To -> Constructor value: $A1

    Filter 2 - Capture Mobile OS Version from User Defined

    Filter Type: Custom Filter, Advanced

    Field A -> Extract A: User-Defined

    Field A -> Extract A value: ^[^:]+::(.+)$

    Output To -> Constructor: Visitor Operating System Version

    Output To -> Constructor value: $A1

    The double colon '::' in the middle of these regular expressions is the separator I have used to split apart the OS and OS version values in the __utmv cookie. You can of course use any valid separator you wish - just don't forget to update these RegEx values if you implement a different version of the code I'm providing in this post!
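If you want to check how the two expressions split a value before saving the filters, you can try them in JavaScript first. The helper name splitUserDefined and the sample values below are made up for illustration. Google Analytics applies its own regular expression engine to filters, although patterns this simple should behave the same way.

```javascript
// The two "Extract A" patterns from Filter 1 and Filter 2.
var platformPattern = /^([^:]+)::.+$/;
var versionPattern = /^[^:]+::(.+)$/;

// Hypothetical helper: mimics what the two filters capture as $A1.
function splitUserDefined(value) {
  var platform = platformPattern.exec(value);
  var version = versionPattern.exec(value);
  if (!platform || !version) {
    return null; // no '::' separator, so neither filter would match
  }
  return { platform: platform[1], version: version[1] };
}

console.log(splitUserDefined('Android::2.1-update1'));
console.log(splitUserDefined('some other user-defined value'));
```

A value without the separator matches neither pattern, so the first two filters leave such hits alone.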

    Finally, we have our third filter. This one simply 'resets' the User-Defined field as though it had never been sent from the __utmv cookie in the first place. This stops our User Defined report within Google Analytics from being cluttered up with OS data that we're sending to other reports anyway. Additionally, you may already be using the User Defined report for storing other information extracted from a source other than __utmv, and so you definitely don't want this data cluttering up your existing reports. So immediately after the first two filters, we add the following:

    Filter 3 - Prevent __utmv Cookie Data writing to User Defined report

    Filter Type: Custom Filter, Advanced

    Field A -> Extract A: User-Defined

    Field A -> Extract A value: (.*)

    Output To -> Constructor: User-Defined

    Output To -> Constructor value: (not set)

    Obviously, if you do use the User Defined report for storing anything else, then these three filters need to appear before your filter(s) that write to the User Defined report - otherwise you'll end up with nothing but (not set) data! Also ensure that your filter ordering keeps them in the order listed above, and that they don't get in the way of any other filtering you happen to be doing.

    With the filters in place and the code changes made, you're done! As visitors on mobile devices visit your site, you should now see OS version data available for any mobile user agents your copy of the JavaScript can pick up. The code we've provided isn't perfect, and you will more than likely still see some (not set) traffic, but you can of course modify and improve the script for your own needs and add other mobile devices to the detection process as you see fit!

    The resulting data appears as follows when you drill down on a specific operating system (using iPads in this example)

    Captured iPad visits and their iOS versions