In the previous article on Speedy Sites, we looked at the stack and the various responsibilities of components in the stack. Now it’s time to look at one specific component, the application server, and how to increase performance.

There are a lot of different ways of serving PHP content. Apache, HipHop, nginx, lighttpd, and even IIS have various mechanisms for interpreting a PHP file, ranging from simple CGI, to Zend Server, to precompiling to C. While the choice is largely a matter of taste (or hot debate, depending on which camp you are steadfastly in), today’s post is going to look at one option in particular: Apache – and specifically what can be achieved with the mod_pagespeed module.

Life in the Old Dog Yet

The Apache Web Server needs no introduction. It has been the mainstay of PHP development since the earliest days, but today it is less likely to be your first choice if performance is your goal.
Asynchronous, event-driven servers such as nginx have earned a reputation for handling high concurrency more effectively than Apache’s process-driven model, leading many development teams to relegate Apache to development environments and VMs. While Apache has variants such as mpm-event which address issues with keep-alive connections (historically the bane of Apache performance and concurrency), these are still “experimental”: not a term that DevOps people like to see on software they install onto a cluster of machines under high load.

Apache has more than a few tricks up its sleeve, and while the new entrants in the web server arena will no doubt catch up, for now Apache suits our needs due to having one module the others don’t: mod_pagespeed.

At this point the nginx fans are shouting “but there’s a port for nginx!”, and indeed its development is progressing steadily, but at the time of writing ngx_pagespeed is in “early alpha”: a bit too avant-garde for us to recommend on a production server just yet, while mod_pagespeed is now a stable 1.x release. So, while those interested in the nginx variant wait for a version they’re comfortable running on their live servers, let’s dig into how this module can make our Apache instance more performant.

PageSpeed: A Brief Overview

PageSpeed is an endeavour headed by Google that makes pages faster by optimising their output. The project began in 2009 and officially left beta in November 2012. It is essentially an output filter that is itself a group of configurable filters, which perform post-processing on the response to ensure that the headers, HTML, JavaScript, CSS and images are as optimised for browser performance as possible. While it isn’t as good as a developer hand-crafting every aspect of a page for performance, it has the advantage of doing it for every page served: an almost impossible goal for a development team on anything other than the smallest of sites.

The Apache module, mod_pagespeed, is available as a binary for Debian and CentOS-flavoured builds, or you can compile from source. Installation is quite painless, though I encourage you to check the up-to-date documentation when you are ready to install this yourself. Currently, the installation process looks like this:
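    # On a Debian-flavoured system; the package URL is the stable 64-bit build
    # at the time of writing – check Google's download page for the current one
    wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-stable_current_amd64.deb
    sudo dpkg -i mod-pagespeed-stable_current_amd64.deb
    sudo apt-get -f install
    sudo service apache2 restart

On Debian the package enables the module automatically, placing pagespeed.conf and pagespeed.load under Apache’s mods-available directory.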

As a first step, let’s look through the pagespeed.conf file and begin tweaking it to meet our needs.

Pagespeed ships with a bunch of filters enabled automatically, so the first thing we should do is disable these. While they will almost undoubtedly help, we need to measure the impact of each filter individually. To make this easier, we’ll also disable pagespeed globally, enabling it and configuring just the filters we want in the appropriate virtual host. Joshua Marantz stated on Twitter that, while you will get savings in page load speed and overall performance, pagespeed can add up to 20ms to first-byte latency. If your server also hosts machine-consumed endpoints, such as a JSON-based API or an OAuth provider, pagespeed won’t really help their performance. While pagespeed is smart enough by default to only filter HTML, it relies on the appropriate content headers, so I usually recommend a whitelist approach to using it.

Disabling pagespeed globally is done by a minor edit to the pagespeed.conf file (on my system that’s /etc/apache2/mods-enabled/pagespeed.conf):
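    # switch the module off everywhere; individual virtual hosts will turn it back on
    ModPagespeed off
    # don't apply the default set of filters when it is on
    ModPagespeedRewriteLevel PassThrough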

Setting the ModPagespeedRewriteLevel directive to PassThrough disables the core filter set, so only the filters we explicitly enable will run.

Testing Performance

Those who read the first article in this series may remember we used the initializr HTML5 boilerplate as an example of in-page optimisations we can make for asynchronous loading and more efficient pages. To see what pagespeed is doing with each filter we will use this earlier example for our testing – but to highlight the changes we will deliberately not use the minified assets and instead use the verbose development versions (the Google API versions were removed and replaced with references to libraries on the host). Why are we testing only HTML and not PHP? Because we want to remove any ambiguity from our results: pagespeed is purely about optimising output, so working with pure HTML ensures there are no inconsistencies or external factors in our experiment.

For each test, the browser and DNS caches were cleared so that a level playing field was established. For this article I focussed entirely on Chrome, but you are encouraged to test on a mix of browsers representing your target user base.

Each configuration was tested five times and the average result taken. Anomalous results that could not be repeated (e.g. a large fluctuation in time to connect to Google’s public DNS) were discounted. The starting state indicated the page was 137KB in size with all assets.

Setting up the VirtualHost

Pagespeed allows you to set up configuration on a per-virtual host basis, which gives enormous flexibility in tweaking for multiple sites running on the same instance. However, adding pagespeed at this level can mean adding a fair bit of noise to the virtual host. It may be that you don’t want to run your sandbox with pagespeed on, and this additional noise in the sandbox doesn’t make reading the file any easier. So, I prefer to put my directives for pagespeed in a separate file. Here’s the virtual host definition:
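    <VirtualHost *:80>
        # ps.example.com and the paths below are placeholders for my test setup
        ServerName ps.example.com
        DocumentRoot /var/www/mod_pagespeed_example

        # all pagespeed directives live in their own file, and are only read
        # if the module is actually loaded
        <IfModule pagespeed_module>
            Include /var/www/mod_pagespeed_example/conf/pagespeed.conf
        </IfModule>
    </VirtualHost>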

Note the defensive use of <IfModule> to ensure you can turn pagespeed on and off without breaking anything here. Performance is an optimisation, not a requirement, and anything in your virtual host that is a nice-to-have rather than a must-have should be wrapped in an <IfModule> stanza.

Getting Small and Fast

Let’s start with some easy optimisations. We want to minify the JavaScript and CSS files used by the site to reduce the size of the response payload. This is something we could do ourselves, either as a post-deploy hook or as a manual step before committing code (if your build automation doesn’t cater for hooks yet), but let’s see what pagespeed does when we enable this. Here’s the original head of index.html:
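    <!-- roughly what my test page starts with; exact file names will vary
         with the initializr options you chose -->
    <head>
        <meta charset="utf-8">
        <title>Speedy Sites</title>
        <meta name="viewport" content="width=device-width">

        <link rel="stylesheet" href="css/bootstrap.css">
        <link rel="stylesheet" href="css/bootstrap-responsive.css">
        <style>body { padding-top: 60px; }</style>
        <link rel="stylesheet" href="css/main.css">

        <script src="js/vendor/modernizr-2.6.2.js"></script>
    </head>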

As you can see, we’ve got a head section containing three CSS files, one inline <style> block, and a JavaScript file. Let’s turn on a few filters in the pagespeed.conf file for this virtual host (in my example, this file is at /var/www/mod_pagespeed_example/conf/pagespeed.conf):
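    # enable pagespeed for this virtual host only, with a hand-picked set of filters
    ModPagespeed on
    ModPagespeedEnableFilters combine_css,rewrite_css,rewrite_javascript

Here combine_css merges adjacent stylesheets, while rewrite_css and rewrite_javascript handle the minification.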

There is great reference documentation if you want to read in detail about anything you see used here. With these settings applied, the output looks like this:
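    <!-- hashes and the exact rewritten URLs are illustrative; yours will differ -->
    <head>
        <meta charset="utf-8">
        <title>Speedy Sites</title>
        <meta name="viewport" content="width=device-width">

        <link rel="stylesheet" href="css/bootstrap.css+bootstrap-responsive.css.pagespeed.cc.HASH.css">
        <style>body{padding-top:60px}</style>
        <link rel="stylesheet" href="css/main.css.pagespeed.cf.HASH.css">

        <script src="js/vendor/modernizr-2.6.2.js.pagespeed.jm.HASH.js"></script>
    </head>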

As you can see, pagespeed has minified the JavaScript and CSS, and combined two of the CSS files into a single file. Subsequent requests don’t need further filtering as pagespeed caches the output (take a look in /var/cache/mod_pagespeed/ and you will see a file containing the entire response, including headers). The output is improved but isn’t anything extraordinary, and Chrome reports we only saved 1KB: we could have done this before deployment and avoided the ~20ms penalty on the response time. Pagespeed has also added ETag and Expires response headers, the latter dated a year in the future, but again, this could have been achieved with simple server configuration.

To see another example of Pagespeed’s behaviour, let’s tweak the HTML slightly:
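    <!-- a deliberately untidy version of the same head -->
    <head>

        <script src="js/vendor/modernizr-2.6.2.js"></script>

    </head>
    <head>

        <style>
            @import url("css/bootstrap.css");
            @import url("css/bootstrap-responsive.css");
        </style>
        <style>body { padding-top: 60px; }</style>
        <style>
            @import url("css/main.css");
        </style>

    </head>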

The whitespace and double tags are deliberate, and note that we’ve also moved the JavaScript above the CSS files, which are now pulled in via @import statements inside <style> tags. Time to add a few more filters, and to make our configuration file more readable we will break each filter out onto a separate line:
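    # one filter per line makes it easy to comment individual filters in and out
    ModPagespeed on
    ModPagespeedEnableFilters combine_heads
    ModPagespeedEnableFilters inline_import_to_link
    ModPagespeedEnableFilters move_css_above_scripts
    ModPagespeedEnableFilters combine_css
    ModPagespeedEnableFilters rewrite_css
    ModPagespeedEnableFilters rewrite_javascript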

With the changed head and the settings above, pagespeed now produces this result:
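    <!-- again, hashes are illustrative -->
    <head>
        <link rel="stylesheet" href="css/bootstrap.css+bootstrap-responsive.css+main.css.pagespeed.cc.HASH.css">
        <style>body{padding-top:60px}</style>
        <script src="js/vendor/modernizr-2.6.2.js.pagespeed.jm.HASH.js"></script>
    </head>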

Now, double <head> tags are an unlikely event, and this was a contrived snippet of code, but note that the JavaScript has been moved back below the CSS, and that, with the CSS now grouped together unbroken by the inline <style> tag, PageSpeed has correctly determined the files should be concatenated into one.

Let’s take a look at some filters which deliver obvious savings, but which would be harder to achieve in advance by other means.

HTML Whitespace

All that nicely indented HTML that makes the code easy to read is simply wasted bytes when we want to optimise for bytes-on-the-wire. Happily, Pagespeed has a few other filters we can turn on to make things smaller still. Let’s append three filters:
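    ModPagespeedEnableFilters collapse_whitespace
    ModPagespeedEnableFilters remove_quotes
    ModPagespeedEnableFilters elide_attributes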

These changes take effect right through the page; here’s a section of the body affected by these filters:
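    <!-- an illustrative snippet; the real markup is the standard initializr body -->
    <div class=container><div class=hero-unit><h1>Speedy Sites</h1><p>This is a template for a simple marketing or informational website.</p><p><a class="btn btn-primary btn-large" href=#>Learn more</a></p></div>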

Note how all the whitespace has been removed by the collapse_whitespace filter, and all the double quotes stripped by remove_quotes. This isn’t dumb stripping; the quotes that must be present around multi-word class attributes are still intact. The third filter, elide_attributes, removes attributes set to their default values from the output. What does this mean in real terms? Well, the initializr boilerplate is hardly a full web page, and starts out in this example at 6294 bytes. Pagespeed has reduced that to 4894 bytes – a not insignificant saving of just over 22% on the HTML payload, and a particularly useful one if your user base is connecting over 3G. Let’s add a few tweaks to the HTML to show one further filter.

The default index.html file contains no real links, but to demonstrate this filter we will need some. Let’s replace every href=”#” with href=”/sub-dir”. Then, from the command line, we’ll put a page at that URL:
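    # a stub page to link to, under the document root used earlier
    mkdir /var/www/mod_pagespeed_example/sub-dir
    echo "<html><body>A stub page to link to</body></html>" > /var/www/mod_pagespeed_example/sub-dir/index.html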

And now, let’s add the new filter to our pagespeed.conf file with the following line:
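    ModPagespeedEnableFilters trim_urls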

The trim_urls filter rewrites absolute links to be relative to the current URL, which results in output such as:
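    <!-- illustrative output; quotes were already stripped by remove_quotes -->
    <li class=active><a href=sub-dir>Home</a></li>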

and:
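    <a class="btn btn-primary btn-large" href=sub-dir>Learn more</a>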

Note though, if you do this, it might be a good idea to add a <base> tag to the head to ensure that the links can still be resolved correctly, and be wary of strange results if you also use mod_rewrite in your application.

Domain Sharding

Domain sharding is the practice of splitting content across multiple domains for performance reasons, though with so many browsers ignoring the RFC on HTTP pipelining it is contentious whether this technique is as worthwhile as it once was. However, if you decide to do it, Pagespeed can do this for you without having to add any application logic – reducing the need for workarounds in development sandboxes for sharding. Assuming you have a leftover vanity domain available, this is trivial to achieve. To adapt our existing example to serve content from the domain http://ps.wr.rs, we use the following settings in our pagespeed.conf file:
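    # ps.example.com stands in for the site's main domain; resources referenced
    # from it will be rewritten to point at the ps.wr.rs shard
    ModPagespeedShardDomain ps.example.com ps.wr.rs
    # fetch the shard's resources from this box rather than over the wire
    ModPagespeedMapOriginDomain localhost ps.wr.rs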

After restarting Apache, we will see the following output as a result:
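    <!-- asset URLs now point at the shard; hashes are illustrative -->
    <link rel=stylesheet href=http://ps.wr.rs/css/bootstrap.css+bootstrap-responsive.css+main.css.pagespeed.cc.HASH.css>
    <script src=http://ps.wr.rs/js/vendor/modernizr-2.6.2.js.pagespeed.jm.HASH.js></script>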

At this point we have improved our output, but we can do more. If we’re adding a new domain to our content, it would be good to warn the browser that this is happening. Let’s add back the reference to the Google AJAX APIs in the head of index.html:
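    <!-- protocol-relative so it matches http or https; pick the library version to suit -->
    <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>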

We can also add the Pre-Resolve DNS filter to our pagespeed configuration:
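    ModPagespeedEnableFilters insert_dns_prefetch

This adds a <link rel=dns-prefetch> hint for each external domain referenced in the page, so the browser can resolve those hostnames before the resources are actually requested.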

A note of caution though: Pagespeed currently can’t do this for the domain specified in ModPagespeedShardDomain; it adds the prefetch based on domains found in the HTML itself. It is still an advantage for things like JavaScript libraries hosted by third parties such as Google, Pinterest and Twitter.

There’s one final optimisation to make here. Pagespeed is gathering the resources via HTTP from http://localhost, as specified by our ModPagespeedMapOriginDomain directive. We can improve performance further by telling Pagespeed where these assets exist on disk, so it can skip the HTTP fetch entirely. To achieve this, we add the following to our virtual host:
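    # map the URL prefix to the document root used earlier, so resources are read straight from disk
    ModPagespeedLoadFromFile "http://ps.wr.rs/" "/var/www/mod_pagespeed_example/"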

Note that this has to be in the main virtual host and not in a directory-specific stanza.

Optimising Images

Often it is the images on a page, rather than the text, that take most of the time to download. We can’t minify them or strip whitespace as we did with the markup, but once again Pagespeed has several tricks to improve performance. We’ve already covered domain sharding, but Pagespeed can take things further and actually optimise the images themselves.

If you are feeling brave, and your site uses images as icons (for example on Twitter and Facebook links, or other site-wide “furniture”), then ideally these should be served as sprites. However, some designers find maintaining sprites a pain: a minor change to one image might mean shuffling many others around to make room, and dozens of lines of CSS to tweak as absolute positions change. Not using sprites, while easier, means the user has to make a separate request for each image, so a dozen icons can mean a dozen extra performance-degrading connections. With a few tweaks to our design, however, we can have Pagespeed do this for us using the sprite_images filter. Even without this, there are some other filters that we can apply that will help page performance:
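    ModPagespeedEnableFilters recompress_images
    ModPagespeedEnableFilters convert_jpeg_to_webp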

What do these do? Well, there is detailed documentation available, but recompress_images is a group filter – that is, it comprises several different Pagespeed filters – which strips meta-data and optimises each image as required. The convert_jpeg_to_webp filter is useful if your stats show a good percentage of your users are on Chrome or Android. WebP is a significantly optimised format created by Google, and we can only hope it gains wider support from other browsers in the future.

Scaling Up and Out

Sharp-eyed readers may have been wondering: if Pagespeed assets are generated with a unique hash, how does Pagespeed scale across multiple servers? Well, initially it didn’t, but as of 1.1.x, Pagespeed supports sharing assets via memcached. Additionally, we can set up per-virtual-host caches; here’s an example virtual host which does both:
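    <VirtualHost *:80>
        # same placeholder names as the earlier virtual host
        ServerName ps.example.com
        DocumentRoot /var/www/mod_pagespeed_example

        <IfModule pagespeed_module>
            # share rewritten assets between servers via memcached
            ModPagespeedMemcachedServers localhost:11211
            # keep this host's file cache separate from other virtual hosts
            ModPagespeedFileCachePath /var/cache/mod_pagespeed/example
            Include /var/www/mod_pagespeed_example/conf/pagespeed.conf
        </IfModule>
    </VirtualHost>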

This setup uses a memcached server running on port 11211 on localhost, but it could just as easily point at a cluster of memcached servers. Note also that we’ve told mod_pagespeed to use the folder /var/cache/mod_pagespeed/example for its file cache, allowing us to segregate one virtual host’s data from another’s.

Upstream or Downstream?

Returning to our sandwich topology of web server -> caching layer -> application server, the question is: where does Pagespeed go? Until recently, I would have instantly recommended behind the caching layer. However, as nginx’s own flavour of Pagespeed matures, the possibility grows of running it entirely in front of the caching layer. The advantage of this is that there is no need to segregate your content in the cache into different hashes, removing the need for Apache entirely and replacing it with nginx on both sides of the sandwich. Having it behind, on the other hand, means there is no latency on subsequent requests, as the optimised content will already be in the caching layer (if the request can be cached). As has been said time and again in this series of articles, though: test your theory.

We’ve barely scratched the surface of what Pagespeed can do, but hopefully this serves as an introduction to how it can save your development team from building deployment procedures and optimisation workarounds by hand – all without changing a line of application code.