The definitive heatmap

After the interest shown in the clickmap / heatmap articles, I’ve decided to gather all the information into an easy-to-use system. What we are going to build is a complete solution for collecting, analyzing and displaying the click information our users give us. It now works on pages that are not center-aligned and is quite a bit more robust. Read on…

What?

If you are a webmaster, you have probably wondered what users do on your website. Beyond the usual statistics, clickmaps let you find where your users are clicking. This is quite useful for finding areas in need of change, layouts that don’t work as intended or anchors that aren’t being understood as you would like.

You’re going to be able to find every single click your users make on your website, whether on a link or in a blank area. We are going to do it the following way:

The process

We need to divide the full process into manageable steps that use open source tools. Since I work on both Windows and Linux systems, I’ll be OS agnostic and use only tools available on most systems, including Mac OS X.
The main steps and the tools they use are the following:

  1. The collecting (JavaScript and Apache)
  2. The processing (Ruby and ImageMagick)
  3. The showing (JavaScript)

The collecting

We are going to use a small snippet of unobtrusive JavaScript so the client can tell our server the click positions. Just place this small JavaScript file at the very end of your template, right before the closing </body> tag:

registerclicks.js

The code adds an onmousedown handler to the document, runs a function on every click and returns true, since we want the user to continue with normal navigation. Whenever the user clicks any part of the page, a tiny request is sent in the background to our server. The script has to calculate the offset of the first element inside the <body> tag, because most pages aren’t aligned to the top-left corner. In liquid layouts the system is not going to work at all.

The request is sent via an XMLHttpRequest object that calls a file on the server. In the last version, I used a small CGI written in Perl to log the request and return an empty document, but since we want to serve a large number of requests, there’s a better method. With the Perl CGI on a modern server, benchmarking with Apache Bench (100 requests, 10 concurrent) gives the following results:


Concurrency Level: 10
Time taken for tests: 6.537187 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 17100 bytes
HTML transferred: 0 bytes
Requests per second: 15.30 [#/sec] (mean)
Time per request: 653.719 [ms] (mean)
Time per request: 65.372 [ms] (mean, across ...)
Transfer rate: 2.45 [Kbytes/sec] received

Mod_imap

Apache has a number of modules that work the following way:
you define a handler and what you want to do with it. Some of them are well known, like mod_perl or mod_cgi, but a lesser-known one, called mod_imap, does exactly what we want. It’s a module meant to serve server-side image maps, but if we use an empty map file, all we get is a 204 status (No Content) and a logged transaction. The difference is quite significant. Using Apache Bench with the same configuration, this is what we get:


Concurrency Level: 10
Time taken for tests: 0.106316 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 36464 bytes
HTML transferred: 20246 bytes
Requests per second: 940.59 [#/sec] (mean)
Time per request: 10.632 [ms] (mean)
Time per request: 1.063 [ms] (mean, across ...)
Transfer rate: 329.21 [Kbytes/sec] received

That’s about 940 requests per second versus 15 with the CGI method! This approach is roughly sixty times faster.

The only thing we have to do to use mod_imap is make a small change to the Apache configuration file. Do it carefully, because a mistake there can affect your entire server. In the relevant section add the following lines:


# Handler name as given in the Apache mod_imap documentation
AddHandler imap-file .map
# Adjust the log path to suit your system
CustomLog /tmp/clicklog clicklog

Then define the custom log format in the same file by adding this:

LogFormat "%q,%{Referer}i" clicklog

This way, everything ending in .map is treated as a server-side map, and since the map is empty, it doesn’t take your user anywhere. The request is still logged, though, in /tmp/clicklog in this example (YMMV).

The log analysis

Since we used Apache’s LogFormat directive to write our log, the format should be easy to parse. The query string is written to the log as it arrives, and each line should look like this:


?x=483&y=32&dx10&dy15,http://demo.html
?x=461&y=177&dx10&dy15,http://demo.html
?x=408&y=40&dx10&dy15,http://demo.html
(...)
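
As a rough illustration of how little parsing this format needs, here is a minimal Ruby sketch. The helper name parse_click_line and the regular expression are assumptions made for this example and are not taken from the downloadable code. The dx and dy fields are skipped, and lines that don’t look like a click are ignored, since a CustomLog directive without a condition records every request in its scope and not only the clicks.

```ruby
# Hypothetical helper: parse one line of the click log into x, y and URL.
# Assumes the "%q,%{Referer}i" log format; the dx/dy fields are ignored.
LINE_PATTERN = /\A\?x=(\d+)&y=(\d+)[^,]*,(.*)\z/

def parse_click_line(line)
  match = LINE_PATTERN.match(line.chomp)
  return nil unless match # not a click request, skip it

  { x: match[1].to_i, y: match[2].to_i, url: match[3] }
end

sample = [
  "?x=483&y=32&dx10&dy15,http://demo.html",
  "?x=461&y=177&dx10&dy15,http://demo.html",
  ",http://demo.html" # a request logged without a query string
]

# Keep only the lines that parsed as clicks.
clicks = sample.map { |line| parse_click_line(line) }.compact
clicks.each { |c| puts "#{c[:url]} -> (#{c[:x]}, #{c[:y]})" }
```

Everything after the first comma is taken as the referring URL, so a referrer that itself contains commas is kept whole.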

I decided to write a Ruby script to parse the file and generate the final images, because I hadn’t used Ruby before and thought it would be a good way to approach the problem. Last time I wrote a structured Perl script, but I think an object-oriented design is the way to go in this particular situation, since the objects are well defined and dividing the program among several coders should be easier too.

Update: Thanks to Jerret, this part has been enhanced using RMagick. Part of the code below can be updated and then runs some 50 times faster. On top of that, a new SourceForge project has been started at http://sourceforge.net/projects/clickmaps/ under a GPL license. Of course, if you don’t want to install or use RMagick you can still download the original version at the end of this post.

I’ll try to explain the model. It uses five classes:

Conf
Sets some configuration variables and returns them as a hash. This way, every configuration variable is set in this class and it’s easy to get them later on.

conf.rb

Readparsefile
Reads and parses the file defined as logfile in the conf object. Each log line is stored in a Click object and appended to an array. Two methods return all the URLs in the log file (geturls) and all the information for a single URL as a Log object (coordsurl).

readparsefile.rb

Click
Stores the data from each log line, including X, Y and URL. Provides a method (xy) that returns a string like “x100y200” for comparing exact coordinates, which is useful for finding the maximum number of times a single click position is repeated.

click.rb

Log
Stores all the values pertinent to a single URL and provides accessors to them. There’s also a “next” method that returns the next click within the same URL.

log.rb

Image
Receives a Log object and the conf object. It has three methods: one normalizes the spot we use as a click indicator (normalizespot), one composes every click as a dot (iterate) and one colorizes the final image (colorize).

image.rb
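
To make the relationship between the data classes more concrete, here is a minimal sketch of Click, Log and Readparsefile. These are illustrative stand-ins written from the descriptions above, not the contents of the original files. In particular, this Readparsefile takes an array of lines instead of a log file name so that it runs on its own, and max_repeats is a hypothetical method name.

```ruby
# Illustrative stand-ins for click.rb, log.rb and readparsefile.rb.
# Assumptions based on the class descriptions, not the original code.

class Click
  attr_reader :x, :y, :url

  def initialize(x, y, url)
    @x, @y, @url = x, y, url
  end

  # Key such as "x100y200", used to spot clicks on the exact same pixel.
  def xy
    "x#{x}y#{y}"
  end
end

class Log
  attr_reader :url, :clicks

  def initialize(url, clicks)
    @url = url
    @clicks = clicks
    @position = 0
  end

  # Returns the next click for this URL, or nil when none are left.
  def next
    click = @clicks[@position]
    @position += 1 if click
    click
  end

  # Highest number of clicks recorded on one exact coordinate (hypothetical name).
  def max_repeats
    @clicks.group_by(&:xy).values.map(&:length).max || 0
  end
end

class Readparsefile
  # Takes log lines directly (assumption; the original reads a file).
  def initialize(lines)
    @clicks = lines.map { |line| parse(line) }.compact
  end

  # All distinct URLs seen in the log.
  def geturls
    @clicks.map(&:url).uniq
  end

  # All clicks for one URL, wrapped in a Log object.
  def coordsurl(url)
    Log.new(url, @clicks.select { |click| click.url == url })
  end

  private

  # Returns a Click, or nil when the line is not a click request.
  def parse(line)
    m = /\A\?x=(\d+)&y=(\d+)[^,]*,(.*)\z/.match(line.chomp)
    m && Click.new(m[1].to_i, m[2].to_i, m[3])
  end
end

lines = [
  "?x=483&y=32&dx10&dy15,http://demo.html",
  "?x=483&y=32&dx10&dy15,http://demo.html",
  "?x=408&y=40&dx10&dy15,http://other.html"
]
file = Readparsefile.new(lines)
file.geturls.each do |url|
  log = file.coordsurl(url)
  puts "#{url}: #{log.clicks.length} clicks, max repeats #{log.max_repeats}"
end
```

The repeat count is the kind of value a normalizing step could use to scale the spot intensity, although how image.rb actually does that is not shown here.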

Then, the main program is only eight lines long. It leverages the objects’ methods to be as compact as possible. In fact, the only thing it does is iterate over each URL and create a separate image for each one.

conf = Conf.new
file = Readparsefile.new(conf.data['logfile'])
file.geturls.each do |url|
    image = Image.new(file.coordsurl(url),conf)
    image.normalizespot
    image.iterate
    image.colorize
end

Is it better?

You can find another program (that one written in Perl) in an older post that does a similar job of making heatmaps. But there have been some modifications that make this a usable system instead of a proof of concept:

Flexible configuration
Unlike the hardcoded last version, in this one it’s quite simple to change the images used to generate the heatmap, or the log name. You only have to modify the Conf definition. It would be easy to use an external configuration file, but doing it this way was quicker for me.
Multiple URL support
While the last version only let you extract one image, this one makes a heatmap for every URL in your log.
Much faster execution time
Instead of composing the full image every time, we now build a single ImageMagick command that does all the composition for us. That gives us a speed advantage of a couple of orders of magnitude. The last version took about fifteen minutes for a couple of hundred clicks, and now it takes about five seconds. Please note that, for many clicks, the program uses quite a bit of memory. For a production environment it would probably be necessary to divide the compose command into manageable chunks and combine them at the end to create the final heatmap.
Manual capture is not needed anymore
Since the last step is to reduce the opacity of the map, we can use a little bit of JavaScript to overlay the PNG image on the original page. The stakeholders can then review it without someone manually capturing the screen, and we don’t need to set up an X server in the production environment.
Easier to maintain and extend
The object-oriented paradigm doesn’t give us faster code, but it gives us much more manageable code. You can extend it as you want.

What you get

Now, you’ll have several images. Most of them are OK to delete, but there’s one ending in final.png that’s your heatmap. We’re going to overlay it on top of your web page. That image should be a semi-transparent PNG like this one:

The overlay

This is the final part of the process. We already have the overlay image and all we need is a JavaScript snippet that can be called at any time and that creates a layer on top of your website with the click information. Just like in the first step, we’re going to position it over the very first item in the page.
The best way to do that is via a bookmarklet, that is, a small JavaScript snippet saved as a bookmark. This way, you can keep it in your browser and ask for the overlay image whenever you like. The JavaScript recalculates the offsets of the first element inside the <body> tag and writes the heatmap image on top of it.

overlay.js

The result

We got a beautiful heatmap on top of our web page. We can call the overlay from wherever we want and show it to the project stakeholders. Look at the result:

The code

I made a ready-to-download package with all the code. It’s released under an MIT license, which means you can do whatever you want with it. It will probably be part of an open source release in the future; if you feel like it, start one yourself or contact me for more information.

Download code. Tar.gz file

What else?

The sky is the limit. If you want a hosted service, contact us. Our company can give you bespoke solutions for all your web intelligence needs, be it log analysis, path tracking and so on. If you’re a developer, feel free to use all the code as you wish, and please write to me about your experiences. Stay tuned!

By the way, there has been a post on the remysharp blog explaining how to record the clicks on a different server. Thanks.

93 responses to “The definitive heatmap”

  1. Pingback: Anonymous
  2. Hello,
    Can you explain how the mod_imap works? Which script exactly calls the .map file? I see no reference to it in the downloadable files.

  3. Hi, Daniel

    In registerclick.js you have to replace the line:

    var url='guardacoordenadas.pl?x='+tempX+'&y='+tempY;

    with

    var url='http://mydomain.com/empty.map?x='+tempX+'&y='+tempY;

    where empty.map is the empty .map file in your system. The sources are already updated. Sorry

    About mod_imap, you can find all the documentation at http://httpd.apache.org/docs/2.0/mod/mod_imap.html

    We are using it in a twisted way, but as it’s one of the core Apache modules, it’s not going to disappear soon.

  4. Fantastic, I can’t wait to get working on this.

    Suggestion, though. To get this working on liquid layouts, couldn’t you also send the dimensions of the viewport (using innerWidth/innerHeight ) and track those dimensions as well?

    I’m not sure if you can do a prompt within a bookmarklet, but you could always create multiple bookmarklets, each for different dimensions?

    Just a thought, otherwise great job.

    leo

  5. Hi, Lain,

    Don’t hold your breath. The only way I can think of would be a div-based page division, to create small heatmaps positioned relatively to the div corner.

    That would probably be too difficult to develop in a general way, so it’s not being taken into account right now.

  6. Hi Tazo,

    I’m just back from holidays. As I have received so many enquiries about that, I’ll try to write an installer (fully Ruby based) and make it available at SourceForge, with documentation.

  7. Your map seems off. Look at the heat points to the right, it’s as if whoever clicked on the links was viewing your site with a different resolution.
    It’s a nice idea though, and on the question of floating backgrounds, wouldn’t it be possible to also send back the dimensions of the browser window and then interpolate the results? It would work, unless you have a *partially* floating design, where some elements have fixed widths.

  8. Hi, Original Sin,

    I did some random clicks on the page to try out the system, so, that probably explains the offset data.

    About the interpolation idea, you’re right. Only trouble is that liquid layouts vary the height of the columns and all the divs (or even all the tags) would have to be treated separately.

  9. david-

    great job on this. unfort. i don’t know if i can get it working as i have no access to my shared servers httpd.conf.

    the mod_imap directives can be set in .htaccess, but from my understanding you cannot set CustomLog or LogFormat there.

    any ideas on a workaround? or should i give up and try the old cgi version?

    many thanks in any case for the fun thoughts…

  10. Hi,

    It seems like a great script, but somehow I experience problems, when I try to run it in “Ruby1.85” I get the following error:

    Invalid Parameter - -fill
    Invalid Parameter - 515x77
    Invalid Parameter - -negate
    Invalid Parameter - -type
    Invalid Parameter - -channel

    Even though I have tried to read into all the great documentation, I struggle to find the solution to how to get the script to work.

    Should I create images, install RMagick or some third solution?

    Thanks in advance.

    Jacob

  11. RMagick is needed for the latest version (~30 times faster) but not if you use the previous one. I’ll upload it to SourceForge as soon as possible so you can download the one that suits you.

  12. I think this is a little off… I ran it and everything works great but all of my clicks are off to the right too. The math is off somewhere. Other than that very nice job!

  13. hi, thanks for sharing your work

    For some reason Im unable to generate any clicks to the clicklog using apache2.2

    I try going directly to the empty.map and it generates nothing in the log. So it has nothing to do with the javascript. It also of course doesnt work with the javascript either.

    Again, I added those lines, stopped and started Apache. The clicklog is created, but it just does not log anything

    Im thinking maybe something has changed in apache2.2?

    In the documentation they say use:

    AddHandler imap-file .map

    I tried both that and

    AddHandler mod_imap .map

    Still no go. Just wanted to see if anyone ran into a similar problem.

    Thanks

Comments are closed.