Search engines for PHP sites

Having spent most of a day investigating different tools for site search, I thought I’d post my experiences here for the benefit of others.

My objective was to find a tool which is free, or at least inexpensive, and which could be readily integrated with the rest of the site, with output suitable for styling with CSS. Until now, as a stopgap, I’d used a Google search form, but Google has the disadvantage that you can’t reindex on demand. I’ve also used Picosearch on another site but, as with Google, the results are generated by an external server, so there is limited control over how they look.

The first I tried was Fluid Dynamics Search Engine (FDSE) (update: in 2019 it’s now defunct). It’s quite a nifty little Perl-based tool, but from my point of view it has one major drawback: the format of the HTML it generates is hard-coded into the script, and entirely table-based. It can be restyled up to a point, but it’s not very flexible.

Integration ought to be easier with a PHP-based tool, so I consulted Google and built up a shortlist of Site Search Pro, PhpDig, FastFind, Sphider and Zoom.

Site Search Pro and FastFind only have online demos, which meant their integration features could not be tested. On that basis I eliminated them immediately.

Several forums and blogs mentioned that PhpDig is very slow at indexing, so I decided not to bother with it, at least until I’d tried the other options.

I downloaded Sphider and followed its relatively straightforward installation instructions. Although there were no problems in getting it working, I was unimpressed with its indexing because it took no account of the BASE element, which meant it generated incorrect URLs. That made it useless for my purposes.
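To illustrate the kind of handling I was looking for, here is a minimal sketch in Python (not Sphider’s code, and not PHP) of how an indexer might resolve links against a page’s BASE element. The names `LinkCollector` and `extract_links` are hypothetical, and the example URLs are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect link targets, honouring the first <base href> if one is present."""

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url  # fallback: the page's own URL
        self.base_seen = False
        self.hrefs = []           # raw href values, resolved later

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "base" and not self.base_seen and attr.get("href"):
            # The base href may itself be relative to the page URL.
            self.base_url = urljoin(self.base_url, attr["href"])
            self.base_seen = True
        elif tag == "a" and attr.get("href"):
            self.hrefs.append(attr["href"])

    def links(self):
        # Resolve only after parsing, so the base applies to every link.
        return [urljoin(self.base_url, h) for h in self.hrefs]


def extract_links(page_url, html):
    """Return the absolute URLs of all <a href> links in the given HTML."""
    parser = LinkCollector(page_url)
    parser.feed(html)
    parser.close()
    return parser.links()
```

An indexer that skips the BASE step would resolve `search.html` against the page’s own (possibly rewritten) URL instead, which is how incorrect URLs end up in the index.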

Finally, I tested Zoom. It’s a sophisticated product, yet it’s free for small sites (up to 50 pages). It’s well worth getting to grips with its abundance of features because, as I found, with some imagination there’s usually a way to do what you want.

To run the indexer you need a Windows PC, but the actual search will run on most servers, as long as they can execute PHP, ASP, or Perl scripts. You can also use it to index static files, which is useful if, for instance, you are supplying documentation on CD.

The indexer generates a set of binary files and the scripts to search them. The HTML generated by the scripts is fairly clean, with DIV elements and class attributes as hooks for styling. So although, as with FDSE, you can’t actually customize the HTML itself, in this case it doesn’t really matter.

Integrating the Zoom PHP script into this site was quite straightforward, although I found I had to set an undocumented variable called $LinkBackURL to cope with the URL rewriting used by my code. I also had to insert some of the proprietary comments that Zoom uses to control its indexing, to ensure that common elements like menu bars were ignored.
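The idea behind such indexing comments can be sketched as follows, in Python for brevity. The markers `<!--NOINDEX-->` and `<!--/NOINDEX-->` are hypothetical placeholders that I’ve made up for the example; they are not Zoom’s actual syntax, and `strip_unindexed` is likewise an invented name.

```python
import re

# Hypothetical marker comments; a real indexer defines its own syntax.
SKIP_BLOCK = re.compile(
    r"<!--\s*NOINDEX\s*-->.*?<!--\s*/NOINDEX\s*-->",
    re.DOTALL | re.IGNORECASE,
)


def strip_unindexed(html):
    """Remove every region wrapped in the skip markers before indexing."""
    return SKIP_BLOCK.sub("", html)
```

Wrapping the menu bar in a pair of such comments means its text never reaches the index, so a search for a menu label doesn’t match every page on the site.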

Once I had it working, I started playing with the features and noticed that it can highlight the search terms in the results. Not only that, but with the help of some supplied JavaScript, when you follow the link to a result, the target page will also show highlights and automatically scroll to the first one. Very cool.
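For anyone curious how term highlighting works in principle, here is a rough Python sketch. It is not Zoom’s implementation; the function name `highlight` and the `highlight` CSS class are assumptions made for the example.

```python
import html
import re


def highlight(text, terms, css_class="highlight"):
    """Return HTML-escaped text with each search term wrapped in a <span>."""
    terms = [t for t in terms if t]
    if not terms:
        return html.escape(text)
    # Try longer terms first so the longest alternative wins at each position.
    pattern = re.compile(
        "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True)),
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches, then wrap the match itself.
        parts.append(html.escape(text[last:match.start()]))
        parts.append(
            '<span class="%s">%s</span>' % (css_class, html.escape(match.group(0)))
        )
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Because every match ends up in a span with a known class, a stylesheet can colour it, and a script on the target page can find the first such span and scroll to it.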

Unfortunately the highlight script fails in IE 5.01 (one of the several browsers I use for cross-browser testing), but I managed to find a fix. I’ve sent it to the Zoom people at Wrensoft and hopefully they’ll incorporate it into their script.

In conclusion, if you are looking for an integrated site search tool, I reckon Zoom should go on your list.
