Anyone who went to the PHP Conference 2013 in London hopefully saw Ilia Alshanetsky’s talk on analysing bottlenecks in your application, which covered some great techniques for testing the performance of your site. This article assumes you’ve taken a look at his slides or are already familiar with Google Chrome’s Developer Tools, so you can refer to the Network tab to gain insight into what is happening to your users. Other tools such as YSlow and Pingdom will also help in gathering data, though they don’t tell the full story.

What happens though if you complete your analysis and you do have a bottleneck, and it’s going to take a while to fix? Or you have some sluggish legacy code that is a huge lump of technical debt? Fixing sites is hard, and as engineers, we are frequently attracted to the most difficult technical challenges. We want to fix the problem, not mask it. That said, there are many ways we can deal with performance without actually fixing the problem, while keeping users even happier.

Speed vs Performance

First let’s quantify what we’re talking about here: speed. What is the difference between speed and performance, you may ask? The core of it is this: you care about performance, but your users care about speed. The average consumer of your site does not care how fast or slow your service layer is, nor about the efficiency of your database calls. They simply want to use your site as fast as possible, and an ever-increasing number of them are accessing your site on mobile devices.

Speed is therefore more than just how fast the page loads; it is also how fast the user perceives your site to be. So, how do we improve this? Well, there is no single fix that will make your site incredibly fast, although some changes will have more effect than others. It requires a holistic approach built from a great many small modifications.

Cycling Race

David Brailsford is the Performance Director of the British Cycling team – a team that has come to dominate world cycling and was central to Team GB winning 65 medals, 29 of them gold, at the London 2012 Olympics. In an interview after the Games, he credited part of their success – in addition to the athletes themselves – to Matt Parker, Head of “Marginal Gains”:

“The whole principle came from the idea that if you broke down everything you could think of that goes into riding a bike, and then improved it by 1%, you will get a significant increase when you put them all together”

(full article: http://www.bbc.co.uk/sport/0/olympics/19174302)

This methodology works very well with websites too. What are these marginal gains we can make? They fall broadly into three categories:

  1. Give the user what they want (and only what they need).
  2. Give them what they need faster.
  3. Don’t make them ask for it in lots of separate requests if you can give it to them all at once.

All of this seems fairly straightforward, and some of it has already been covered in the post on web performance tips on the main Inviqa blog. There is another collection of tweaks however, which I guess could be described as “magic” – or more accurately, illusion.

These “sleight of hand” tricks give the illusion of a fast site, even if something is taking a while to do. Recently a friend asked me to put together a website for his cottage industry. He paints miniatures for tabletop wargamers, and he has a level of control with a brush that is simply incredible; I could easily believe he could put makeup on an ant. As you can imagine, his website is very much about showing off his work, and so he has lots of high resolution images of models he has painted. These images are stored on Flickr and displayed via a jQuery carousel on his front page.

The problem, of course, is that these images are huge, and his customers want to see his handiwork right away. You can optimise, crop and play with an image only so much before a site meant to show off artwork ends up showing off fuzzy, low-quality images that detract from the message he’s trying to convey. In the end, we put his favourite image from the set on the page statically, then, as soon as enough images had loaded, replaced it with the carousel, fading in the new content. The effect was that his users believed the page had loaded extremely quickly, instead of waiting for the carousel, which would have made the page take many seconds to load.
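As a sketch of that approach (assuming jQuery is already on the page; the element ids, the Flickr feed parameters and the carousel plugin call are all illustrative):

```html
<div id="showcase">
  <!-- the favourite image, visible immediately in the initial HTML -->
  <img id="hero" src="img/favourite-miniature.jpg" alt="Painted miniature">
  <ul id="carousel" style="display: none;"></ul>
</div>
<script>
$.getJSON('//api.flickr.com/services/feeds/photos_public.gne?id=USER_ID&format=json&jsoncallback=?',
  function (feed) {
    var loaded = 0,
        needed = 5; // only swap once the first few images are ready
    $.each(feed.items.slice(0, needed), function (i, item) {
      $('<img>')
        .on('load', function () {
          $('#carousel').append($('<li>').append(this));
          if (++loaded === needed) {
            // fade the static image out and the carousel in
            $('#hero').fadeOut(400, function () {
              $('#carousel').fadeIn(400).carousel(); // hypothetical carousel plugin call
            });
          }
        })
        .attr('src', item.media.m);
    });
  });
</script>
```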

Of course, this illusion of speed isn’t limited to static binary assets. A popular website design has a navigation bar across the top of the page, with the user’s username and/or avatar indicating that they are logged in. If a user wishes to modify their profile settings, this will mean fetching data from one or more sources to populate the forms. But what do we already have? Well, we can at least populate part of the form with the same details we already show in the navigation, and then bring in the additional details via AJAX, as sketched below. The net result is that the site feels more active to the user, because they have something they can do immediately.
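A minimal sketch of that pattern, again assuming jQuery; the element ids, the data-username attribute and the /profile.json endpoint are invented for illustration:

```html
<script>
$(function () {
  // 1. Populate the fields we can fill from the navigation bar straight away
  var $nav = $('#nav-user');
  $('#profile-username').val($nav.data('username'));
  $('#profile-avatar').attr('src', $nav.find('img.avatar').attr('src'));

  // 2. ...then fetch the remaining details in the background
  $.getJSON('/profile.json', function (profile) {
    $('#profile-email').val(profile.email);
    $('#profile-location').val(profile.location);
  });
});
</script>
```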

Don’t Overlook Mobile Connectivity

These days, when web developers talk about “responsive” websites, we usually mean Responsive Web Design. But “responsive” can also mean reacting or replying quickly or favourably, and that becomes more important to consider as users increasingly interact with websites via mobile devices. Working on a laptop with a constant Ethernet or reliable WiFi connection, we tend to forget that this is not the experience of someone out and about on an iPhone.

The experience of a mobile user is very hard to emulate if your development environment is a VirtualBox instance running on your local machine. Network latency can of course be synthesised (check out our previous post about Charles), but it isn’t a natural step, and you would want to switch it off when not testing front-end performance. The other downside is that it gives your local instance a marked advantage over assets on a CDN (Content Delivery Network): comparing a round trip to get jQuery from your local /js directory with getting it from something like the Google Ajax APIs, Google will seem slower, which is unlikely to be the case in reality. As a result of all these factors, I recommend that you get a “playground” – fire up a cloud instance and test performance against that.

Are Public CDNs Right For You?

Since I just mentioned the Google Ajax libraries, this is a good time to talk about specifics. A lot of developers assume that simply by using Google to serve a library such as jQuery, they have improved their site’s loading times. This is possible, and there is also the advantage that Google serves these files over both HTTP and HTTPS, allowing you to use the scheme-relative URL //ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js to get the file. If your site only uses this one JavaScript file, all is well, but if not then there is a “gotcha”. Let’s take a popular HTML5 boilerplate, Initializr, as an example. It gives you a responsive layout, Twitter Bootstrap, jQuery and Modernizr, all in one bundle. It’s one of my personal favourites, so let’s look at some (abridged) code:
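The script includes in that template look roughly like this (abridged; the vendor paths follow the downloaded bundle, so treat the exact versions and locations as illustrative):

```html
<head>
  <!-- ... -->
  <script src="js/vendor/modernizr-2.6.2-respond-1.1.0.min.js"></script>
</head>
<body>
  <!-- ... page content ... -->
  <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
  <script src="js/vendor/bootstrap.min.js"></script>
  <script src="js/plugins.js"></script>
  <script src="js/main.js"></script>
</body>
```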

This works well, but you have five JavaScript files being included on this, and probably every other, page:

  • modernizr-2.6.2-respond-1.1.0.min.js
  • jquery.min.js from Google’s CDN
  • bootstrap.min.js
  • js/plugins.js
  • js/main.js

As these come from two domains, that’s two DNS lookups and six requests (one for the page itself and five for the JavaScript alone, and we haven’t touched CSS yet). Compare this with creating an init.js file combining all of these files: it becomes one DNS lookup (which the browser has already done to find the page) and two requests (again, including the page). In such circumstances, is it worthwhile using the Google CDN? You could of course combine just the local files, resulting in two DNS lookups and three requests, but you need to actually play around to see which works for you (read Adam’s post on multi-region deployments for examples of how to establish what works best).

There’s a common misconception that, by using jQuery directly from Google, there is a greater chance of it already being cached in the browser from a visit to a previous site, and that the browser will have already done the DNS lookup. However, as most sites pin themselves to a particular version of jQuery for stability, that cache can be fragmented across dozens of minor versions. Additionally, Mobile Safari doesn’t persist its cache across application restarts, so the chances are smaller than you might initially assume. Relying on your user having already done something elsewhere is not a sound performance strategy, so always test your assumptions in your playground.

Prefetching Domains

While browsers are getting better at scanning HTML ahead of time to gather the domains it references, not all of them do, and there are also cases where content is loaded dynamically from another domain. In the earlier example, an API feed from Flickr was consumed, and the images were then displayed directly in the carousel. When the browser comes across this API feed (and subsequently the images, which are static files whose URLs are concocted from information returned by api.flickr.com), it has to do a DNS lookup, typically adding 10–30ms to each request. Taking the worst case, that’s 30ms to get the API feed, and another 30ms to get the IP of whichever static farm the image is hosted on! Adding 60ms to the lookup time is something that can be avoided, however. Let the browser know that somewhere on this page you will need content from a particular domain by adding dns-prefetch tags just after the opening <head> tag:
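For the Flickr example, that might look like this (the farm subdomain is illustrative; use whichever asset domains your pages actually reference):

```html
<head>
  <link rel="dns-prefetch" href="//api.flickr.com">
  <link rel="dns-prefetch" href="//farm8.staticflickr.com">
  <!-- ...the rest of the head follows as normal -->
```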

This tells browsers that support dns-prefetching that they should, in parallel, look up the domain specified because it will be needed at some point. This prevents DNS blocking when fetching assets. There is some debate as to whether domain sharding is still a good technique, but if you use it, then prefetching will also speed up browsers if the asset domains are completely separate. The issue is muddied by more recent browsers ignoring the RFCs on HTTP pipelining, so again, test your assumptions.

Using JavaScript Passively

So far, DNS lookup times have been improved and one large file has been created containing all of the site-generic JavaScript. Two questions remain:

  • how do we manage this file and its dependencies?
  • how do we use it?

It is common practice to place the JavaScript includes right before the closing </body> tag, or at least add defer or HTML5’s async attributes:
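For example (js/site.js is a placeholder for your own script):

```html
<!-- either: place the include right before the closing body tag... -->
    <script src="js/site.js"></script>
  </body>

<!-- ...or: keep it in the <head>, but tell the browser it can defer execution -->
  <script src="js/site.js" defer></script>
<!-- or let it download in parallel and execute as soon as it arrives -->
  <script src="js/site.js" async></script>
```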

However, we also want to put JavaScript throughout the body of our page, and the techniques covered so far don’t deal with dependencies, only with loading the library in a non-blocking way. If we think of a simple Bootstrap page with three columns, and we want the leftmost column to contain the latest tweets from our Twitter account, then the JavaScript to initialise this is best placed within that block. As we widgetise our sites more and more, we want to write small, easily-reusable components, and that means having JavaScript within the same logical block as the element it relates to. If the application uses a templating engine that is aware of placeholders, we could append this block to the bottom of the page, but we still have a dependency: the library we are relying on must be present.

Most of the time we work on the premise that if we put a library like jQuery in the <head>, it is hopefully there by the time we use $(document).ready(). In browsers that block while loading JavaScript in the <head>, this is guaranteed, but then any further page processing is blocked until the file is available. More and more browsers fetch assets referenced in the <head> without blocking, so it is increasingly likely that the library is there when it is needed. However, with 3G mobile devices averaging connection speeds of under 2Mbit/s, it is never a certainty, especially if your first use of the library is early in the body of the page. We need to defer until at least the core dependencies are in place. Let’s take a look at some code, modifying our Initializr template again.
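In essence, a small inline snippet near the top of the page (in the <head> or just after the opening <body> tag, say) sets up the object:

```html
<script>
  // Create the tp (TechPortal) object, unless something has already defined it
  var tp = window.tp || {};
  // fn holds the callbacks that widgets further down the page will queue up
  tp.fn = tp.fn || [];
</script>
```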

What is happening here? The script is creating a small JavaScript object called tp (for TechPortal), and defensively checking to see if it already exists, in case this script is being included in the page in an unexpected way (such as Lightbox, or an Ajax call). We also give it a property, fn, which is an array to hold callbacks.

Now let’s add a JavaScript call to the page. The earlier example was a Twitter feed, so let’s assume we have a file, tweets.js, which requires the jQuery UI library. I’m basing this example on the Sea of Clouds Twitter widget, with some additional jQuery UI functionality thrown in. Our template requires Modernizr to work properly, so we’ll simply use the built-in Modernizr loader, which is based on yepnope:
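Something along these lines, where the div id, the file paths and the Twitter username are illustrative:

```html
<div id="tweets"></div>
<script>
  tp.fn.push(function () {
    Modernizr.load({
      // dependencies for this widget: jQuery UI, then the widget itself
      load: ['js/vendor/jquery-ui.min.js', 'js/tweets.js'],
      complete: function () {
        // initialise the Sea of Clouds widget against the named <div>,
        // plus whatever extra jQuery UI behaviour the widget needs
        $('#tweets').tweet({
          username: 'techportal',
          count: 3
        });
      }
    });
  });
</script>
```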

Here, an anonymous function is appended to the fn array, and Modernizr.load() is used to retrieve the dependencies. Once these have been loaded, the tweets are then initialised, targeting the named <div>. However, this code won’t actually do anything just yet, because it is simply a function, and nothing has called it.

Let’s create a file called init.js. We’re going to use this file on every single page of the site, so we’ll prepend minified jQuery and Modernizr to it to reduce requests:
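A sketch of what follows the prepended libraries in init.js:

```javascript
// init.js -- the minified jQuery and Modernizr sources are prepended
// above this comment by the build step, so one request fetches all three.

// Re-declare tp defensively, in case this file somehow loads first
var tp = window.tp || {};
tp.fn = tp.fn || [];

$(document).ready(function () {
  // Run every page-specific callback queued up in tp.fn, now that
  // jQuery and Modernizr are guaranteed to be available
  for (var i = 0; i < tp.fn.length; i++) {
    tp.fn[i]();
  }
});
```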

And now we append this to our example page:
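That is, a single script include goes in right before the closing </body> tag (the path is whatever your build outputs):

```html
    <script src="js/init.js"></script>
  </body>
</html>
```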

Using this method, we ensure there are no blocking or dependency issues for the JavaScript on page load, while also having libraries available without needing advance knowledge of every single widget on a page. The page will load without needing any JavaScript until it reaches the closing </body> tag, then call in the site-wide Modernizr and jQuery libraries in one file, reducing requests and DNS lookups. After that, it will iterate through all the page-specific JavaScript, sort out any dependencies as necessary, and finally execute the functionality nested within these latent functions.

We could also add all of our Bootstrap and responsive files into init.js, although if there is a lot of HTML on the page, it may take a while before the browser reaches the tag that includes it, which could leave the site looking odd for a second or two until the responsive elements are brought in. In that case, experiment with placing it in the <head>, keeping the defer and async attributes, and see how performance is affected by the change.

As the loop to call the anonymous functions is wrapped inside jQuery’s $(document).ready() event handler, it should still gracefully wait until the page is loaded.

Test, Test, and Test Again

You should (with effective analytics and logging) have an idea of which browsers and devices your users are using, and be able to target your testing at these. Other considerations might include whether tweets.js is the only library that needs jQuery UI; if so, is there a benefit to keeping the two libraries separate, or is it better to combine them into one file with jQuery UI prepended? If there are only a few libraries like this, all using jQuery UI, another option is a libraries.js file where all the libraries live, or including jQuery UI in the site-wide init.js, or simply having all of the JavaScript inside init.js. Again, test to see what works for you and, more importantly, your users.

This applies to CSS files as well: you need to find what works for a specific application. A really handy tool for concatenating and managing CSS, LESS and JavaScript is CodeKit. It watches your project folder and automatically generates the aggregated files. With it, I can easily choose to prepend or append libraries, process LESS files into CSS, and minify the output if I wish. This means I don’t have to worry about writing my custom JavaScript and CSS in a minified way: I simply keep the files in a resources directory and tell CodeKit to output them into the /js and /css folders in my html directory. It also has the YUI Compressor and JSLint built in, allowing you to produce very small aggregate files and catch obvious parsing issues in your aggregated JavaScript. CodeKit isn’t free, but I find the licence fee worth it for the time it saves me. Alternatively, you can use node.js and make the compilation of your LESS files and the validation of your JavaScript part of your build process, as sketched below.
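As a sketch of that node.js route, this is roughly what a Gruntfile could look like using the grunt-contrib-less, grunt-contrib-jshint and grunt-contrib-concat plugins; the file paths mirror the resources/html layout described above and are purely illustrative:

```javascript
// Gruntfile.js -- one possible node.js build: lint the source JavaScript,
// compile LESS to CSS, and concatenate the site-wide init.js bundle.
module.exports = function (grunt) {
  grunt.initConfig({
    jshint: {
      all: ['resources/js/**/*.js']
    },
    less: {
      site: {
        files: { 'html/css/style.css': 'resources/less/style.less' }
      }
    },
    concat: {
      init: {
        // prepend minified jQuery and Modernizr to the init script
        src: ['resources/js/vendor/jquery.min.js',
              'resources/js/vendor/modernizr.min.js',
              'resources/js/init.js'],
        dest: 'html/js/init.js'
      }
    }
  });

  grunt.loadNpmTasks('grunt-contrib-jshint');
  grunt.loadNpmTasks('grunt-contrib-less');
  grunt.loadNpmTasks('grunt-contrib-concat');

  grunt.registerTask('default', ['jshint', 'less', 'concat']);
};
```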

In my next article, I’ll look beyond optimising the HTML and static assets of the page, at how we can optimise the stack to improve the delivery of your pages.