Setting up an HTTP-accessed Mercurial repository on shared web hosting (e.g. Dreamhost)

The goal of my previous two blog entries was to access my Mercurial repository via HTTP (to clone the repository) and via SSH (to commit changes into the repository). The next step in my journey was setting up the HTTP access.

It's possible to rent a web service for Mercurial hosting (see: MercurialHosting), but in my case I'm already renting this Dreamhost account and I want to use its oodles of disk space to stash stuff which I can retrieve from "anywhere". FWIW Dreamhost offers Subversion hosting as one of their "goodies", but after 1+ years using it, well, let's just say I desperately want to use something else.

Mercurial has a built-in web server (hg serve) for serving repositories over HTTP, but it's impossible to use this on a shared hosting account, and in any case the Mercurial documentation makes it clear that the hg serve server is suboptimal.

PublishingRepositories makes it clear how to use the two CGI scripts provided with Mercurial. The first, hgweb.cgi, supports only a single repository. The other, hgwebdir.cgi, supports multiple repositories and is more flexible. I'm going to use the second one as it's clear (to me) I'll be using multiple repositories.

In addition to PublishingRepositories it's useful to consult HgWebDirStepByStep.

The first step is to create a subdomain and website to house the mercurial repository. The subdomain could be hg.example.com (mine is housed at hg.davidherron.com). Most hosting providers have a control panel that lets you create domains, subdomains, websites, etc, and it's here that you set it up. This requires both the subdomain by which the repository is referred to (hg.example.com) and the webserver configuration to enable HTTP access to the domain. The Mercurial documentation discusses setting up virtual hosting in an apache httpd.conf file, however with shared hosting the control panel takes care of configuring the virtual hosting for you.

Your webserver must be configured to allow CGI execution.

The next step is to retrieve the CGI file. It is either in the source distribution or you can retrieve it from the Mercurial web site using this URL: http://www.selenic.com/repo/hg-stable/raw-file/tip/hgwebdir.cgi

$ cd docroot
$ wget http://www.selenic.com/repo/hg-stable/raw-file/tip/hgwebdir.cgi

It's important to put this CGI script in the directory set up to store the files for your chosen subdomain (hg.example.com). In the hosting provider control panel, when you create the subdomain it will also tell you a pathname on the server's file system corresponding to this subdomain. It's this directory where you place the hgwebdir.cgi file.

Again, your webserver must be configured to allow CGI execution. Additionally it must treat files ending with ".cgi" as a CGI. This detail will be configured in your hosting provider control panel and, indeed, it may be the default when the domain is configured.
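Before going further it can save some head-scratching to confirm that CGI execution really works for the subdomain. The following is only a minimal sketch of a throwaway test script, not something from the Mercurial documentation: the file name test.cgi and the helper name cgi_response are my own inventions, and the first line assumes a python can be found on the PATH. Saved in the docroot and made executable, it should show one line of plain text when visited in a browser.

```python
#!/usr/bin/env python
# Hypothetical test.cgi: a throwaway script to confirm the web server
# executes ".cgi" files. It is not part of Mercurial.
import sys


def cgi_response(body):
    """Return a minimal CGI response: one header, a blank line, then the body."""
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # Reporting the interpreter path shows which python the server ran.
    sys.stdout.write(cgi_response("CGI works; python is %s\n" % sys.executable))
```

If the browser shows the script source, or a server error, the problem is in the hosting configuration rather than in anything Mercurial-related.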

The next step is to rename the file to something more appropriate and make it executable:

$ mv hgwebdir.cgi index.cgi
$ chmod +x index.cgi

The name index.cgi is more appropriate and makes more sense in a URL. The choice is arbitrary, of course, but people generally understand "index.html" and the like as the main URL for a directory. The CGI script also has to be executable, which is what the chmod takes care of.
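The two requirements so far (the script is executable, and its first line names an interpreter) can be checked mechanically. This is a rough sketch under my own assumptions: check_cgi_script is a hypothetical helper name, and it only inspects the file, it does not prove the web server will run it.

```python
import os


def check_cgi_script(path):
    """Return a list of problems found with a CGI script (empty list if none)."""
    if not os.path.isfile(path):
        return ["%s does not exist" % path]
    problems = []
    # The web server can only run the script if an execute bit is set.
    if not os.access(path, os.X_OK):
        problems.append("not executable; run chmod +x")
    # The first line must be a "#!" line naming the interpreter.
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("first line is not a #! interpreter line")
    return problems
```

Calling check_cgi_script("index.cgi") from the docroot should return an empty list once the mv and chmod steps above are done.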

The next step is to tweak the script a little based on your own needs. A lot will depend on specifics of your web hosting arrangement.

The first line of the script ensures it is executed by python. It's essential that this python be the one into which mercurial has been installed. If you installed a local python and mercurial as I did then the script will have to read something like this:-

#!/usr/bin/env /home/reikiman/bin/python

The path /home/reikiman would be wherever you installed python.

A couple of lines into the script is the comment "# adjust python path if not a system-wide install:". If, as in my case, your Mercurial is not installed in the system-wide Python, you'll have to take action here as well. Uncomment the two lines following that comment and edit them similarly to this:-

# adjust python path if not a system-wide install:
import sys
sys.path.insert(0, "/home/reikiman/lib/")
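A quick way to tell whether the path is right is to try the import by hand, using the same python that the first line of the script names. What follows is a small sketch, assuming only that the Mercurial package is importable under the name mercurial; the helper name can_import is made up for this example.

```python
import importlib
import sys


def can_import(module_name, extra_path=None):
    """Return True if module_name imports, optionally after adding extra_path."""
    if extra_path is not None and extra_path not in sys.path:
        # Same trick the CGI script uses: put the local install first.
        sys.path.insert(0, extra_path)
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True
```

With my layout the call would be can_import("mercurial", "/home/reikiman/lib/"). If it returns False, the CGI script will fail in the same way, so it is worth sorting out before involving the web server.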

A few lines further down are a couple of lines which allow pages to be encoded with UTF-8. I believe that's a good idea and have uncommented them in my setup.

The next step is to create the hgweb.config file. Its syntax is briefly described in the CGI script, and HgWebDirStepByStep has more detail. The initial configuration to put in this file is the list of repositories to serve.

The file should contain

[paths]
repo1 = path/to/repo1
repo2 = path/to/repo2
...

It's probably simplest to put the repository right in the docroot such that your config file might read:-

[paths]
drupalv5 = drupalv5
drupalv6 = drupalv6
...

The left-hand-side of each line is the name mercurial will show for the repository, while the right hand side is the path where mercurial will find the repository.
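Typos in the right-hand side are easy to make, so here is a sketch of a sanity check for the [paths] section. It rests on a few assumptions of mine: it needs Python 3 for the configparser module, it resolves relative paths against the directory holding hgweb.config (which matches the docroot layout above), and check_hgweb_config is a hypothetical name. The one hard fact it relies on is that a Mercurial repository contains a .hg directory.

```python
import os
from configparser import RawConfigParser


def check_hgweb_config(config_path):
    """Map each name in [paths] to True if its path holds a .hg directory."""
    parser = RawConfigParser()
    parser.optionxform = str  # keep repository names exactly as written
    parser.read(config_path)
    results = {}
    if not parser.has_section("paths"):
        return results
    # Assumption: relative paths are relative to the config file's directory.
    base = os.path.dirname(os.path.abspath(config_path))
    for name, repo_path in parser.items("paths"):
        repo_dir = os.path.join(base, repo_path)
        results[name] = os.path.isdir(os.path.join(repo_dir, ".hg"))
    return results
```

Any name that maps to False points at a directory that is missing or is not (yet) a Mercurial repository.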

With this much set up you should be able to visit http://hg.example.com/path/index.cgi/repo1 and see the repository in your browser. Additionally you should be able to retrieve the repository this way:-

$ hg clone http://hg.example.com/path/index.cgi/repo1

It may be possible to configure the server such that "index.cgi" is not required in the URL. PublishingRepositories gives a snippet that can be used in a .htaccess file for this; however, I tried it and it did not work for me.
