September 15, 2008
Over the weekend I created Limerickipedia as a bit of a joke and an exercise in working with MediaWiki. To get it running I had to overcome a number of obstacles, so I thought I’d write them up briefly here.

[Image: Limerickipedia Home Page]
The first problem I found was that the help pages don’t get installed automatically, even though the links to them do. This is very irritating and must have irked other new users. Since MediaWiki, like many other platforms, comes installed with a small amount of content it seems absurd not to include the help pages, especially as the process of installing them is such a chore.
The next issue is that the default is not to run with clean URLs. Being accustomed to the ease of setting these up in Drupal and WordPress, I was slightly disappointed to find that in MediaWiki you have to set up your .htaccess file by hand (why isn’t an example supplied with the install at least?).
There’s no administration backend, so you have to edit settings in the LocalSettings.php configuration file by hand. Not terribly convenient and a bit old-fashioned, but it works.
The template system needs a bit more work in separating logic and presentation. Big chunks of literal HTML in the middle of function definitions are ugly, hard to read and make maintenance harder.
When it came to setting up Google Webmaster Tools I hit my first real hurdle. Here’s why: Webmaster Tools has two verification methods by which it checks that you are an administrator with access to a site and hence entitled to see its stats (as opposed to a prying competitor or nosy parker).
One of the two methods involves uploading a file which has a secret name assigned to it by Google. In the process of verification this file is retrieved and then another, non-existent, file is requested. This second step is vital: the first request must succeed (with HTTP status 200) and the second fail (with HTTP status 404) to ensure that the special file really exists. If both succeed then the logical conclusion is that the server is responding to all requests with status 200 and there’s no way to tell whether the file was actually uploaded or not, and hence whether you really are the site administrator.
If Google did not perform the second check then a server behaving like this could be “verified” by anyone! And by default this is the way that MediaWiki is set up, with a non-existent page being treated as one not yet created, rather than as an error. This behaviour is fine for potential wiki pages, but not for real files, so the configuration needs a bit of a tweak to work with Webmaster Tools.
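The two-request check can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Google's actual code: the function name, the `fetch_status` callback, the two stand-in "servers" and the file names are all invented for the example.

```python
from typing import Callable


def is_verified(fetch_status: Callable[[str], int],
                secret_path: str,
                bogus_path: str) -> bool:
    """Return True only if the secret file exists AND a bogus file does not."""
    # Step 1: the uploaded verification file must be served normally.
    if fetch_status(secret_path) != 200:
        return False
    # Step 2: a file that cannot exist must produce a genuine 404.
    # A 200 here means the server says "OK" to everything, so step 1 proved nothing.
    return fetch_status(bogus_path) == 404


# Hypothetical stand-ins for two kinds of server (no network access involved).
def well_behaved_server(path: str) -> int:
    return 200 if path == "/google-secret.html" else 404


def answers_200_to_everything(path: str) -> int:
    return 200


print(is_verified(well_behaved_server, "/google-secret.html", "/noexist-xyz.html"))
print(is_verified(answers_200_to_everything, "/google-secret.html", "/noexist-xyz.html"))
```

The second stand-in behaves the way a default MediaWiki install is described as behaving here, which is why it can never pass the check.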
The remedy, for Apache users at least, is fairly simple. Here’s a working .htaccess file, adapted from one set up for clean URLs:
ErrorDocument 404 /404.html
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} !^/(skins|stylesheets|images|config)/
RewriteCond %{REQUEST_URI} !^/(redirect|texvc|index)\.php
RewriteCond %{REQUEST_URI} !^/(noexist|google|404).*\.html
RewriteRule ^(.*)$ /index.php?title=$1 [L,QSA]
The two lines I added to solve the problem are the ErrorDocument line at the top and the last RewriteCond, the one matching noexist, google and 404. I also had to create a file called 404.html, purely for Google’s benefit. In summary, this solution ensures that the URLs requested by Google bypass MediaWiki altogether, and that there is a document returned when a 404 is issued.
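To see which requests the conditions let through, here is a rough Python approximation of the three RewriteCond lines. It only mimics the pattern matching (with the dot before php escaped) and ignores everything else mod_rewrite does; the function name and the sample paths are made up for illustration.

```python
import re

# A request URI matching any of these patterns is left alone by the rewrite
# rule, so Apache serves the real file or returns a real 404.
BYPASS_PATTERNS = [
    r"^/(skins|stylesheets|images|config)/",
    r"^/(redirect|texvc|index)\.php",
    r"^/(noexist|google|404).*\.html",
]


def goes_to_mediawiki(request_uri: str) -> bool:
    """True if the request would be rewritten to /index.php?title=..."""
    return not any(re.search(p, request_uri) for p in BYPASS_PATTERNS)


# Hypothetical sample requests.
for uri in ["/Main_Page", "/skins/monobook/main.css",
            "/google1234abcd.html", "/noexist_1234abcd.html", "/404.html"]:
    target = "MediaWiki" if goes_to_mediawiki(uri) else "Apache directly"
    print(uri, "->", target)
```

Only the wiki page is handed to MediaWiki; the Google-style file names and 404.html are handled by Apache itself, so a missing one produces a proper 404.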
(I know I could have used the alternative verification method, of adding an extra meta tag to the page, but it seems the wrong answer to add that overhead to every response.)
Despite the above gripes and hassles, once set up MediaWiki runs very smoothly and is a very neat piece of software. I doubt that Limerickipedia will ever outshine its big brother, though!