Reducing bandwidth use, decreasing page load times, for better Drupal user experience
One of my sites has a high traffic load (2000 visits per day, over 6000 page views per day) and has been using up the bandwidth allotment on the shared hosting account where it's hosted. My concerns are that a large download per page turns visitors away because of the long page load time, and that excessive bandwidth use has an environmental impact. Initially the only measurement tool I had was the realtime bandwidth statistics provided by the hosting provider (see the screenshots below). Only later, once Firebug worked on Firefox 3.5, did YSlow become usable. Firebug and YSlow turned out to be the critical tools for this project. By using the following tips my site went from a 'D' YSlow score to a 'B' and has drastically reduced its bandwidth use.
These are the bandwidth charts over the last 30 days and 7 days of performing the changes. Note how the bandwidth use bounces around a bit in a daily cycle but is generally steady at 20 kbytes per second. This is the bandwidth use I ultimately want to reduce. Along the way I found some other things to decrease as well.
Aggregating CSS and JS files
A bit about YSlow use
In YSlow, look first at the 'Grade' tab, which breaks the overall grade out into an individual grade for each rule. A quick scan down the list shows the areas that need attention. Click on the Components tab to see details for each object required by the page. It shows a table, grouped by component type, with details about each one: the uncompressed and compressed size, the URL, the expiration date, and so on. For most of this work you'll be going back and forth between the Grade and Components tabs to see which components to shrink to get a better grade.
Browser caching of files
The Expires header defines how long a given file can be cached in the browser. For files that rarely change, setting an expiration in the distant future lets the file be downloaded once and kept by the browser on the user's computer. YSlow's Components tab has a column showing the expiration date.
The article "Optimizing page load times using mod_deflate, mod_expires, and ETag on Apache2" goes over the use of mod_expires plus some other techniques.
In Drupal, one egregious situation improved by mod_expires is the editor widgets that can be used on the node add/edit form. My sites use BUEditor, which comes with about 10 little PNG files. Without an Expires header each of these is loaded every time the node add/edit page is loaded.
In the grade for "Reduce HTTP Requests" YSlow might recommend combining multiple images into sprites. The technique is to load one larger image instead of 10 small images, and to use CSS and HTML techniques to address each subsection of the larger image. Unfortunately BUEditor's design does not lend itself to sprite usage. But installing an Expires header goes a long way towards solving this issue.
Turning on Expires support first requires ensuring mod_expires is enabled, and then using these directives:
# ExpiresActive must be On, otherwise the ExpiresByType rules are ignored
ExpiresActive On
ExpiresByType image/jpeg "access 30 days"
ExpiresByType image/gif "access 30 days"
ExpiresByType image/png "access 30 days"
The article linked above has more information about this.
Turning on compression
The discussion "Compression -- Drupal side or Apache side or both?" gives a very good overview of how to enable page compression. The technique is to use a compression algorithm (such as GZIP) to compress the data sent from the web server to the browser. It requires matching compression algorithms on each end, and it requires more CPU power to run the compression and decompression. The tradeoff is between CPU usage on one side and bandwidth and page load times on the other.
Note that Drupal's "Optimize CSS" and "Optimize JS" options under admin/settings/performance do not gzip. As I said above, optimizing those files did not decrease total bandwidth use, but did decrease the number of individual files downloaded.
In the Components tab you can inspect compression levels by comparing the 'Size' and 'Size GZIP' columns.
Turning on compression increases CPU load on the server, with the possibility that the server's CPU power gets used up performing compression. It's worth measuring whether it makes more sense for your site to compress or simply to use more bandwidth.
It is important to compress only once. Compressing a compressed file doesn't make an even smaller file, but instead makes a larger result. There are two places compression can be enabled, either in Drupal (using PHP's GZIP support) or in the web server using mod_deflate in Apache.
To use mod_deflate use something like the following which is adapted from the discussion linked above. Also see the documentation here: http://httpd.apache.org/docs/2.2/mod/mod_deflate.html
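The configuration from the linked discussion is not reproduced here. As a minimal sketch, assuming Apache 2.2 with mod_deflate loaded and a host that allows these directives in .htaccess, it might look something like the following. The exact list of content types is my assumption, not necessarily what the original discussion used.

```apache
<IfModule mod_deflate.c>
  # Compress text-based responses only; image types are deliberately left out.
  AddOutputFilterByType DEFLATE text/html text/plain text/xml
  AddOutputFilterByType DEFLATE text/css
  AddOutputFilterByType DEFLATE application/x-javascript application/javascript
</IfModule>
```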
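This enables compression for the given content types. Image files are left out because they are usually already compressed, so the benefit of compressing them again is very small. It is simple to demonstrate this by running the gzip command line tool on different kinds of files. In the listing below, the JPEG and PDF barely shrink and the PNG actually grows by a few bytes, while the CSS and JS files shrink to between roughly a quarter and a half of their original size.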
% ls -l greenhouse-gas.jpg
-rw-r--r-- 1 davidherron staff 44849 Aug 24 20:29 greenhouse-gas.jpg
% gzip greenhouse-gas.jpg
% ls -l greenhouse-gas.jpg*
-rw-r--r-- 1 davidherron staff 43875 Aug 24 20:29 greenhouse-gas.jpg.gz
% ls -l screenshot-drupal.org.png
-rw-r--r-- 1 davidherron staff 27127 Oct 29 2007 screenshot-drupal.org.png
% gzip screenshot-drupal.org.png
% ls -l screenshot-drupal.org.png*
-rw-r--r-- 1 davidherron staff 27176 Oct 29 2007 screenshot-drupal.org.png.gz
% ls -l perm\ pavement.pdf
-rw-r--r--@ 1 davidherron staff 40381 Jul 5 19:05 perm pavement.pdf
% gzip perm\ pavement.pdf
% ls -l perm\ pavement.pdf*
-rw-r--r--@ 1 davidherron staff 37579 Jul 5 19:05 perm pavement.pdf.gz
% ls -l system.css
-rw-r--r-- 1 davidherron staff 10020 Jan 9 2008 system.css
% gzip system.css
% ls -l system.css*
-rw-r--r-- 1 davidherron staff 2859 Jan 9 2008 system.css.gz
% ls -l jquery.js
-rw-r--r-- 1 davidherron staff 31089 Jun 25 2008 jquery.js
% gzip jquery.js
% ls -l jquery.js*
-rw-r--r-- 1 davidherron staff 15710 Jun 25 2008 jquery.js.gz
% ls -l tabledrag.js
-rw-r--r-- 1 davidherron staff 39171 Jun 18 05:24 tabledrag.js
% gzip tabledrag.js
% ls -l tabledrag.js*
-rw-r--r-- 1 davidherron staff 10360 Jun 18 05:24 tabledrag.js.gz
Minifying CSS and JS files
Minification means squeezing the unnecessary whitespace out of a file while keeping it semantically the same. The web browser doesn't care about human readability and is able to grok CSS or JS files that lack whitespace just as readily as it groks the human readable ones. Turning on the CSS and JS optimization in Drupal takes care of this issue.
Unfortunately some themes include some CSS or JS directly in the theme file. This CSS or JS does not get optimized and YSlow might complain about that. Sorry, there's not a lot you can do about that without hacking the theme.
Configuring ETags (entity tags)
YSlow complains about ETags on my site and gives it a D on this score. I've tried reading the Apache documentation about this several times and it just doesn't make any sense to me. The article "Optimizing page load times using mod_deflate, mod_expires, and ETag on Apache2" documents a method to turn off ETags, and supposedly YSlow will then shut up about this.
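The article's exact directives are not reproduced here. A common way to turn ETags off in Apache, offered as a sketch rather than as that author's method, is the core FileETag directive. The mod_headers part is optional and assumes that module is loaded.

```apache
# Stop Apache from generating ETags for static files
FileETag None

# Optional, assumes mod_headers is available:
# remove any ETag header that is still being set elsewhere
<IfModule mod_headers.c>
  Header unset ETag
</IfModule>
```

After a change like this the ETag column in YSlow's Components tab should come up empty for static files, which is one way to check that it took effect.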
Use a Content Delivery Network (CDN)
CDNs are good for high traffic sites because they can distribute the files out to servers that are "near" the users (in terms of network topology). However CDNs are not within my budget of approximately $0, and further Drupal makes it nigh on impossible to use a CDN. There are some contributed modules to enable CDN usage, but I've never used them. I'm simply ignoring the 'D' given to me by YSlow on this attribute.
Enabling parallel loading
As noted above, web browsers load only a few files at the same time from any one hostname. However, they can be tricked into doing more through the use of multiple subdomains. The Parallel module makes this possible if you set up three subdomains of your domain. It farms the file requests out to these subdomains, enabling the browser to request more files at the same time. I haven't tried this. It is not a trick that will decrease bandwidth use, but it should decrease the page load time by allowing the browser to do more things in parallel.
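For what it's worth, the subdomains only need to serve the same files as the main site. On a server where you control the virtual host configuration, a sketch might look like the following. All hostnames and the path are hypothetical placeholders, and on shared hosting the same thing is usually done through the hosting control panel instead.

```apache
<VirtualHost *:80>
  # Main site hostname (placeholder)
  ServerName www.example.com
  # Three extra hostnames answering from the same document root (placeholders)
  ServerAlias static1.example.com static2.example.com static3.example.com
  DocumentRoot /var/www/example
</VirtualHost>
```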
There are other things YSlow measures but Drupal already handles most of them well.
Of the above items it is enabling compression that did the most good. I first turned on Expires headers 2-3 weeks before turning on compression, watched the bandwidth usage and did not see any noticeable decrease. However turning on compression gave an instant and dramatic decrease in bandwidth use.
This was taken about 1 hour after enabling compression. You'll note the immediate dive in bandwidth use.
This is the 30 day view after two weeks. It's clear the immediate bandwidth decrease has held up over a longer period. Further, Google Analytics data shows the number of page views has, if anything, increased slightly over the same period.
Finally this is the 7 day view taken two weeks after the change. This shows the general bandwidth use has dropped from 12-15 kbytes per second to 7 kbytes per second, and further the aggregate bandwidth use reported by the hosting provider is nowhere near being exhausted.