Earlier today I installed a WordPress plugin recommended for tracking the popularity of posts. The plugin is unsurprisingly named "Recently Popular". After installing the plugin I ran some quick tests and found that I was getting extra hits recorded. I spent a bit of time backtracking to find the source and, after systematically disabling all other plugins and page elements, found that it was firing in wp_head() in the page header.
After some more digging, I noticed that the extra hit was for the chronologically next published post and that the problem occurred in both WordPress and WordPressMU. This wasn't making a lot of sense so I decided to try a different browser - more of a sanity test than anything. That's when I found it didn't occur in Chrome or Opera - just Firefox 3.5.6, which I'd upgraded to a few hours earlier.
I fired up the Live HTTP Headers add-on and checked out the requests Firefox was making. It was definitely making both post requests. I took a closer look at the second request and noticed the extra header "X-Moz: prefetch".
A quick search for X-Moz: prefetch turns up Mozilla's Link prefetching FAQ which gives a good description of what is happening and why. WordPress creates a tag similar to the following when wp_head() is executed:
<link rel='next' title='The Next Post' href='http://your_domain/year/month/day/the_next_post/' />
I am unaware of any way to disable the prefetch hints. You could edit your header.php and remove the wp_head() statement, but many plugins rely on the execution of this function so results could be unexpected and undesirable. The issue for me was not that the hint was published but that the prefetch hits were being counted as real post requests, in addition to the actual request when I clicked through a second or two later. This would seriously skew the perceived popularity of posts.
My solution was to ensure that the Recently Popular plugin ignored post requests that passed the "X-Moz: prefetch" header. Depending on your server configuration, the method for checking whether the header exists may differ - apache_request_headers() (alias getallheaders()) is only supported when PHP is installed as an Apache module. Most servers should support checking for $_SERVER['HTTP_X_MOZ'].
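As a rough sketch of that check (written in Python rather than the plugin's PHP, so it is not the plugin's actual code): CGI-style environments expose the "X-Moz" header under the key HTTP_X_MOZ, the same name PHP uses in $_SERVER. The function names and the in-memory counts dict below are hypothetical stand-ins for whatever the plugin really uses.

```python
def is_prefetch(environ):
    """Return True if the request carries Firefox's "X-Moz: prefetch" header.

    `environ` is a CGI/WSGI-style mapping, where the "X-Moz" header is
    exposed as "HTTP_X_MOZ" (the same key PHP uses in $_SERVER).
    """
    return environ.get("HTTP_X_MOZ", "").strip().lower() == "prefetch"


def record_hit(environ, post_id, counts):
    """Count a view for post_id unless the request is only a prefetch.

    `counts` is a hypothetical in-memory dict standing in for the
    plugin's real storage.
    """
    if is_prefetch(environ):
        return False  # speculative prefetch, not a real page view
    counts[post_id] = counts.get(post_id, 0) + 1
    return True
```

The page is still served to the prefetching browser as normal; only the counting step is skipped.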
I wonder how many other people will wonder why their page hit stats have mysteriously increased without any increase in ad impressions, etc.
I will contact the plugin author to suggest an update once I've published this post.
Thursday, 17 December 2009
"X-Moz: prefetch" and skewed page-hits
Friday, 9 October 2009
Hey, Where'd My Space Go?
Recently I was doing some routine stuff when I noticed there was a LOT less space on my main hard-drive than I expected. I was down to less than 500MB of space! I remember the days when I was smug about having a 30MB drive when the guy working on the next PC only had a 20MB drive. It's hard to believe that in this age of relatively gigantic drives we can still fill them up without too much effort. I guess we can put it down to the ever-increasing file sizes driven by higher pixel counts of digital cameras, the recent epidemic of software bloat, and the invention of peer-to-peer file sharing.
I needed to free up some space, so I dug out my favourite drive-space analysis tool. It's pretty lean and has a great interface, so I thought I would share it here.

This is Steffen Gerlach's freeware application for Windows called Scanner. Once the application has scanned your drive, you can drill down through each folder of the sunburst chart to easily identify what has been gobbling up your drive space. Admittedly the initial scanning can take a few minutes, but no more than it takes to grab a cup of coffee.
Thursday, 9 July 2009
Tracking Email Clicks in Analytics

It's common practice to send registered users an email to confirm account activity, to keep them up to date via a newsletter, or to try and encourage return activity. In many cases we are not really measuring how effective these mailings are or how they impact on our website traffic.
At first glance it looks like a tricky problem: mail client applications will generally not pass a referrer, and browser-based mail will be recorded as one of the hundreds of mail domains in use. Link Tagging is the simple solution, although there are a few options depending on how deep you want to go.
Source and Medium
By appending utm_source and utm_medium parameters to your links you can easily track how many visits are directly attributable to your mailings and see them in the All Traffic Sources report.
Here's an example of how your links should look:
http://www.yoursite.com/somepage.html?utm_source=Newsletter&utm_medium=email
Setting the utm_source value will replace any referrer value as the Traffic Source, so random browser mail domains will be consolidated under one value, along with any email clicks with no referrer, which would usually be classed as "(direct)". This is the only required parameter of this type; any other utm_xxxx fields used in conjunction with utm_source are optional.
Using utm_medium=email is recommended, especially if you are using more than one utm_source value in different email types (e.g. Newsletter, AdminEmail, ReferAFriend) so that you can easily filter the results on the All Traffic Sources report.
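If you generate your emails from code, the tagging can be automated. Here is a minimal sketch in Python, assuming a hypothetical helper named tag_link; it appends utm_source and utm_medium (and optionally utm_campaign) while keeping any query string the link already has.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_link(url, source, medium="email", campaign=None):
    """Return url with utm_* parameters appended to its query string."""
    params = {"utm_source": source, "utm_medium": medium}
    if campaign is not None:
        params["utm_campaign"] = campaign

    parts = urlsplit(url)
    # Keep existing parameters, but drop stale copies of the ones we set.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in params]
    query.extend(params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


print(tag_link("http://www.yoursite.com/somepage.html", "Newsletter"))
```

Calling it with the example page above and the source "Newsletter" produces the same link shown earlier in this section.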
Campaigns
Specifying a utm_campaign value can help group your links in a more meaningful way. This could be a subgroup of your source categories (e.g. utm_campaign=200907 to identify the monthly Newsletter for July 2009) or you could use a campaign like utm_campaign=Winter-Sale across many sources (email, banner, CPC, etc). It all depends on what you want or need to measure. Whatever you choose, any utm_campaign values tracked will be displayed on the Traffic Sources->Campaigns report.
Content and Terms
These options are less common but still useful. Setting the utm_content parameter could help identify whether text or HTML emails are getting more clicks. Alternatively you could track the comparative success of different creatives from the same campaign. utm_content values tracked will be displayed on the Traffic Sources->Ad Versions report.
I've included the utm_term here just for completeness. It's usually used to identify search terms or keywords purchased. utm_term values tracked will be displayed on the Traffic Sources->Keywords report.
Handy Hint
Even if you're already tracking your email clicks with another solution, it's probably worth adding these parameters (or at least some of them). As long as they are passed to the landing page it doesn't matter if you add them to the pre or post tracking URL. You may need to make some tweaks so that your tracking solution passes utm_xxxx parameters on to the destination URL.
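What that tweak looks like depends entirely on your tracking solution, but the idea can be sketched in Python. The helper name forward_utm and the track.example.com URL in the usage line are made up for illustration; the function simply copies any utm_ parameters from the tracking URL onto the destination before redirecting.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def forward_utm(tracking_url, destination_url):
    """Copy utm_* parameters from tracking_url onto destination_url."""
    utm = [(k, v) for k, v in parse_qsl(urlsplit(tracking_url).query)
           if k.startswith("utm_")]
    utm_names = {k for k, _ in utm}

    dest = urlsplit(destination_url)
    # Keep the destination's own parameters unless the tracking URL overrides them.
    query = [(k, v) for k, v in parse_qsl(dest.query, keep_blank_values=True)
             if k not in utm_names]
    return urlunsplit(dest._replace(query=urlencode(query + utm)))


print(forward_utm(
    "http://track.example.com/c?id=9&utm_source=Newsletter&utm_medium=email",
    "http://www.yoursite.com/somepage.html",
))
```

If the tracking URL carries no utm_ parameters, the destination URL comes back unchanged.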
Sunday, 17 May 2009
RSS: Learn To Burn...

RSS is a tricky thing to measure: requests are not tracked like normal webstats, and are commonly anonymous or made via a proxy. The frequency of requests is dependent on the user's feed reader and could be daily, weekly, hourly or even every minute (or anything in between). This is why it's important to have a good grasp of how much bandwidth and processing resources your RSS feeds are using.
RSS Caching
The easiest way to offset the processing cost is to cache your feed. Depending on your site's publishing schedule and implementation, the caching period and method used will be different. The basic idea is to dump your feed into a file and serve that.
On each request, check whether the cache file exists and is younger than, say, 15 minutes. If not, build the feed and dump it into the cache file, ready for the next request. Depending on the frequency of requests, this can reduce your feed building resource cost considerably.
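The check described above might look something like this in Python. It is only a sketch: cached_feed and the build_feed callback are hypothetical names, and the 15 minute lifetime is just the example figure used here.

```python
import os
import time

CACHE_SECONDS = 15 * 60  # example cache lifetime: 15 minutes


def cached_feed(cache_path, build_feed, max_age=CACHE_SECONDS):
    """Return the feed XML, rebuilding the cache file only when it is stale.

    `build_feed` is a caller-supplied function that returns the feed as a string.
    """
    try:
        fresh = (time.time() - os.path.getmtime(cache_path)) < max_age
    except OSError:
        fresh = False  # no cache file yet (or it is unreadable)

    if fresh:
        with open(cache_path, encoding="utf-8") as fh:
            return fh.read()

    xml = build_feed()
    # Write to a temporary file first so readers never see a half-written feed.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.write(xml)
    os.replace(tmp_path, cache_path)
    return xml
```

The expensive build step then runs at most once per cache period, no matter how many feed readers poll in between.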
Introducing FeedBurner

FeedBurner works very well with the major blog publishing sites, but it's also worth investigating if your site is standalone.
The initial payoffs of using FeedBurner are that it can help you get a handle on the size of your subscription base, and that it will cache and serve your feed, thereby absorbing much of the processing and bandwidth costs.
There has been quite a bit of discussion about the accuracy of subscriber stats provided by the FeedBurner service. As stated earlier in this article, RSS stats are problematic due to the plethora of clients and the complications of anonymity and proxy services. Having said that, the service offered is a lot better than no stats and in my opinion the benefits outweigh the cost many times over.
Don't Lose Your Audience
One of the important tips about integrating FeedBurner is to make sure that your subscribers still subscribe to your site's feed URL and are redirected to your FeedBurner URL. This way, if you ever decide to drop the FeedBurner service, then you won't leave your subscribers stranded with a defunct FeedBurner URL. Google has been quite open about this issue, if you know where to look.
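The redirect itself could be sketched like this in Python. Two things here are assumptions rather than anything documented in this post: that FeedBurner's own fetcher identifies itself with "FeedBurner" in its User-Agent (it must be let through to the original feed, or the redirect would loop), and that a temporary (302) redirect is preferable so feed readers keep your site's URL on file. The FEEDBURNER_URL value is a placeholder.

```python
FEEDBURNER_URL = "http://feeds.feedburner.com/your_feed"  # placeholder URL


def feed_response(user_agent):
    """Return (status, headers) for a request to the site's own feed URL."""
    # Assumption: FeedBurner's fetcher has "FeedBurner" in its User-Agent.
    if "feedburner" in (user_agent or "").lower():
        return 200, {}  # serve the original feed to FeedBurner itself

    # Everyone else is sent on to FeedBurner with a temporary redirect.
    return 302, {"Location": FEEDBURNER_URL}
```

If you later drop the service, removing this redirect is all that is needed and subscribers carry on with the same URL.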
If you are redirecting traffic you need to make one small change to your FeedBurner options to make this work properly - but it's not that easy to find... Click on the Optimize tab for your feed, and then BrowserFriendly in the Services menu. At the bottom of the form, in the Content Options section, there is a link with the text "Use your redirected feed URL on your BrowserFriendly landing page". Click on that and then enter your site's feed URL.
This change should result in most subscribers using your site's URL. However, this still doesn't seem to work correctly with Firefox's Live Bookmarks. I haven't found a decent workaround for this yet, or even much evidence that it is an issue, but for me it never works, so be aware. Even the "ClearFeed" landing page is somewhat confusing when Live Bookmarks are used, which is a concern considering Firefox's popularity.
FeedBurner Pros vs Cons
Pros
- Free stats/caching service
- Reliable infrastructure
- Simple to use
Cons
- Stats are tied to a single Google login
- Subscription stats allegedly fluctuate
- Some Firefox Live Bookmarks issues
Monday, 27 April 2009
Always get a Baseline first
As developers, we often spend time optimising, tweaking or redesigning to increase performance. It's fairly easy to measure performance increases when optimising code or systems, but it's a lot harder to gauge the effectiveness of User Interface changes.
In many cases the UI choices we make are subjective at best. In some cases, our design decisions can actually make things worse, not better. There are some interesting anecdotes about this in this recent post.
In order to make sure we are making the right decisions, rather than just a bunch of assumptions, we need to measure the effectiveness of the current implementation in order to compare with the improved version.
Depending on what it is that you're changing, this could impact on conversions, page hits, or data collection rates. You'll need to decide the best way to measure the effectiveness of the changes, but without data to measure it, you'll never really know if you're doing it wrong.
It seems like a simple idea, but it applies to all optimisation and is often overlooked. A simple rule to follow is to remember to ask "How will we know if this works?" before you implement a change.