[Xapian-discuss] Possible configurations

Eric Theise eric at godengo.com
Mon Sep 11 17:59:46 BST 2006


Good morning all,

We run a network of sites; they're separate entities, hosted on multiple
servers, and they require separate indexing.  Sites may be updated at any
time, and we'd like to be able to update our databases incrementally.  Our
systems are PHP-based, so it was easy to start off by using Sphider, but we
seem to be on our way to outgrowing it.

I'm working through the Xapian docs and the mailing list archive, but clues
are scattered across the years, and I'm interested in current thoughts about
best practice.

One option seems to be to crawl our sites with htdig, keeping all indexes on
a master server.  I'm setting up a system using htdig2omega this morning,
but at first glance, it seems as if we'd lose the ability to do incremental
updates this way.

The other option would be to keep the search facilities with each site and
not use htdig or wget at all; this seems like a better way to go if
server resources allow it.

I also wonder if flint is the predominant database format these days, or if
there are reasons to stay with quartz.

Thanks in advance for your input,

Eric




