How to use Mercurial to hack Drupal Core without killing kittens

Date: Thu Jan 14 2010 Drupal »»»» Mercurial »»»» Drupal Planet
In the Drupal community there's a saying that every time you hack core you kill a kitten. And who would want to kill a kitten? The point is that any time you hack core (modify the Drupal core files) it becomes a nightmare to carry your changes forward as Drupal core is updated. The Drupal team updates Drupal routinely (about every 2 months), and it's best to keep your Drupal installation up to date, especially as many of the fixes are for security bugs.

I have developed a method to hack Drupal core without killing kittens. It is very simple: use Mercurial to maintain your own source tree, and use the Mercurial Queues (patch queue) extension to maintain your patches.

The Mercurial Queues extension maintains a set of local patches. It's derived from the 'quilt' application someone or other wrote a long time ago in a kernel development team far far away to enable keeping local patches to the Linux kernel.

Enable the extension by adding this to your .hgrc file:


[extensions]
hgext.mq =
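
If you set up servers from a script, the same two lines can be appended only when they are missing. Here's a minimal sketch, not anything Mercurial ships: "enable_mq" is a hypothetical helper name, and it assumes your per-user config is ~/.hgrc unless you pass another path.

```shell
# enable_mq [HGRC]: hypothetical helper that enables the mq extension in HGRC
# (default: ~/.hgrc) unless a matching line is already present.
enable_mq() {
    hgrc="${1:-$HOME/.hgrc}"
    # Accept both spellings Mercurial understands: "mq =" and "hgext.mq =".
    if grep -Eq '^[[:space:]]*(hgext\.)?mq[[:space:]]*=' "$hgrc" 2>/dev/null; then
        return 0
    fi
    # A section may appear more than once in an hgrc, so appending is safe.
    printf '\n[extensions]\nhgext.mq =\n' >> "$hgrc"
}
```

Calling "enable_mq" a second time leaves the file untouched, so it is safe to run on every deploy.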

The Mercurial Queues home page goes over how to use this feature. I'll show here how to set it up for use with Drupal.

First begin with an unpacked Drupal install.


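wget -O- http://ftp.drupal.org/files/projects/drupal-6.14.tar.gz | tar xvfz -
mkdir -p drupal-6.14/sites/all/modules   # harmless if the directory already exists
cd drupal-6.14/sites/all/modules
# download other modules here
cd ../../default
cp default.settings.php settings.php
chmod 666 settings.php
mkdir files
chmod 777 files
# set up Drupal
chmod 444 settings.php
cd ../..
hg init
hg add                                   # hg init alone tracks nothing
hg ci -m "Drupal 6.14 plus contributed modules"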

In other words, you set up a Drupal install and then run "hg init" inside it to convert it into a Mercurial repository. This has some interesting benefits because Mercurial allows for distributed development. For example, you could keep a common repository of Drupal and contributed modules, and use that common repository to run several Drupal installations.

Setting up a new Drupal site could be done like this:


# create a docroot directory for the new site (e.g. using cPanel or whatever admin tool you use)
# cd to the docroot
hg clone ssh://..../drupalv6    # check out a copy of the workspace
mv drupalv6/* drupalv6/.h* .
rmdir drupalv6

The last two lines are required because "hg clone" puts the files into a subdirectory named "drupalv6" (or whatever your parent repository name is) and what you want is for those files to be in the docroot home directory. The 'mv' command moves the files into the docroot home, and the rmdir command removes the unneeded directory.
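
The reason 'mv' needs two globs is that in most shells "*" does not match names beginning with a dot, so "drupalv6/*" on its own would leave ".hg" and ".htaccess" behind. The sketch below replays the move in two steps in a throwaway directory. Plain files stand in for a real clone, and the variable name "demo" is only for illustration.

```shell
# Throwaway stand-in for a fresh clone; no Mercurial needed for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/docroot/drupalv6/.hg" "$demo/docroot/drupalv6/modules"
touch "$demo/docroot/drupalv6/index.php" "$demo/docroot/drupalv6/.htaccess"

(
    cd "$demo/docroot" || exit 1
    mv drupalv6/* .       # moves index.php and modules/, but no dotfiles
    ls -A drupalv6        # lists what the first mv left behind
    mv drupalv6/.h* .     # moves the entries whose names start with ".h"
    rmdir drupalv6        # fails if anything is still inside
)
```

Because 'rmdir' refuses to remove a non-empty directory, it doubles as a check that no other dotfile was left behind.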

Now you have a docroot directory containing a Drupal instance managed by Mercurial, and it has a parent Mercurial repository. You could check out as many instances of this repository as you like, for example to have a development server installation and a production server installation.

Next let's initialize the Mercurial Queues extension for a particular instance.


hg qinit -c

After running this command you'll see in the ".hg" subdirectory a new directory named "patches". The "-c" option makes that directory a Mercurial repository in its own right (a plain "hg qinit" sets up an unversioned patch directory). That sub-repository is where the patches we'll be creating live.

Now let's consider the development and production server installations. There are many kinds of patches one might want to have between them. An obvious one is the sites/default/settings.php file.


cd development-server-docroot
cd sites/default
hg qnew settingsPhp
vi settings.php
hg diff
hg qrefresh
hg diff

The "hg qnew" command starts a new "patch" named in this case "settingsPhp". Once you modify the settings.php file running "hg diff" shows some differences. Running "hg qrefresh" causes all outstanding differences to be added to the current patch. Running "hg diff" again shows there are no differences, because the differences have been saved in the patch.


hg qpop -a -f

This undoes all the currently applied patches. If you compare settings.php before and after running "hg qpop" you'll see that it reverts to the original file.


hg qpush -a -f

This applies all the available patches. Again comparing settings.php before and after shows that it now has the change again.

One thing I always have to hack is the .htaccess file, because one of the hosting providers I use sets the default memory limit at a skimpy 32M, whereas my Drupal install needs 64M to work.


cd production-server-docroot
hg qnew htaccess
vi .htaccess    # add: php_value memory_limit 64M
hg qrefresh
cd sites/default
hg qnew settingsPhp
vi settings.php
hg qrefresh

That just created two patches on the production server. The first was the modification to .htaccess to set the memory limit, the second was whatever settings.php changes are required for the production server.


hg qseries
hg qapplied

The first of those shows the full list of patches. The second shows the ones which have been applied. Because the queue is a stack, you can control how many patches are applied at a given moment by pushing or popping up to a named patch (for example "hg qpush htaccess").

Now suppose you've found a bug in Drupal core, or maybe a performance improvement, and you want to patch it.


cd development-server-docroot
cd modules/xyzzy
hg qnew myCrazyPatch
# edit some files, e.g. with vi
hg qrefresh

You've created a new patch (myCrazyPatch) and saved the differences away in it. You can test your changes, and any time you want, a new "hg qrefresh" will refresh the patch file with the outstanding changes.

When you think it's ready to deploy on the production server:


cd development-server-docroot
cp .hg/patches/myCrazyPatch /tmp/myCrazyPatch
cd production-server-docroot
hg qimport /tmp/myCrazyPatch
rm /tmp/myCrazyPatch
hg qpush -a -f

This copies a patch from one repository to another. Perhaps this is a deficiency in Mercurial, but there isn't an organized way to tell it to copy a patch from one repository to another. However, the patch is stored as a plain text file in the patches directory, so it's trivial to copy the patch elsewhere and then use "hg qimport" to add it to a different repository.

Now what happens when a new version of Drupal comes along? This is the point where the normal Drupal developer would have killed a kitten because they have to make sure that the upgrade process does not kill their patches.

However, by following the above steps, the Mercurial repository has an unpatched copy of Drupal core in it. Remember that your patches are in the patches repository, not in the main repository. Cloning a fresh repository gets you a fresh and unpatched version of Drupal.


cd development-server-docroot
hg qpop -a -f
cd ..
wget -O- http://ftp.drupal.org/files/projects/drupal-6.15.tar.gz | tar xvfz -
(cd drupal-6.15; tar cf - .) | (cd development-server-docroot; tar xvf -)
cd development-server-docroot
hg add
hg ci
hg push
hg qpush -a -f

Let's take this step by step. We've already seen how "hg qpop -a -f" removes all the patches; this ensures the development repository is pristine. The next few lines retrieve the next Drupal release from drupal.org, unpack it, and copy those files over the top of the development server repository. Back in the development server repository, "hg add" makes sure that any files new in this Drupal release are added to the repository, "hg ci" commits the changes that bring it to the next Drupal release, and "hg push" sends those changesets to the parent repository. Finally, "hg qpush -a -f" reapplies your local patches.
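
One thing the copy-over-the-top step cannot do is delete files that the new release dropped, because 'tar' only adds and overwrites. A minimal sketch for spotting them follows. "removed_in_release" is a hypothetical helper name, and it assumes you still have the previous release unpacked next to the new one (for example drupal-6.14 beside drupal-6.15).

```shell
# removed_in_release OLD_DIR NEW_DIR: hypothetical helper that prints files
# present in the old unpacked release but missing from the new one.
removed_in_release() {
    old_list=$(mktemp) || return 1
    new_list=$(mktemp) || return 1
    ( cd "$1" && find . -type f | LC_ALL=C sort ) > "$old_list"
    ( cd "$2" && find . -type f | LC_ALL=C sort ) > "$new_list"
    # comm -23 keeps only the lines unique to the first (old) list.
    LC_ALL=C comm -23 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

Running "removed_in_release drupal-6.14 drupal-6.15" from the parent directory lists candidates; each one can then be removed in the docroot with "hg rm" before the "hg ci" step.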

The final "hg qpush" may fail if one of the patches no longer applies. This can happen if the Drupal project changes something you changed, for example if they made the same fix you did. If a patch fails, you can either drop that patch (for example with "hg qdelete" once it is unapplied) or fix the affected files by hand and run "hg qrefresh" to update the patch.
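
When a push fails, the hunks that could not be applied are left in the working directory as ".rej" files next to the files they were meant for. A small sketch for finding them is below. "list_rejects" is a hypothetical helper name, not an mq command.

```shell
# list_rejects [DIR]: hypothetical helper that lists leftover reject files
# under DIR (default: current directory), skipping Mercurial's own .hg data.
list_rejects() {
    find "${1:-.}" -name .hg -prune -o -type f -name '*.rej' -print
}
```

Once each listed hunk has been merged by hand and the ".rej" file deleted, "hg qrefresh" records the repaired patch.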

The way this works is that at the moment of copying the new Drupal release over the old one, the working copy is pristine, unchanged Drupal. Your patches are safely tucked away somewhere else. Those local patches are then reapplied with the "hg qpush" command.

Local changes to a contributed (non-core) module are handled the same way.