[Xapian-discuss] Xapian across multiple servers

Olly Betts olly at survex.com
Wed Jul 23 04:08:12 BST 2008


On Tue, Jul 22, 2008 at 06:21:46PM +0100, Richard Boulton wrote:
> Olly Betts wrote:
> >> 2) Run Xapian locally on each front-end webserver, but storing the
> >> index on shared storage.  This will be I/O intensive, but doesn't
> >> involve syncing changes out to each front-end.
> > 
> > Probably OK if the search load is low.  A high search load on a large
> > database will probably get ugly.
> 
> Actually, even for a reasonably high search load, if the Xapian database 
> is very rarely changed this could work well if the database is small 
> enough to get fully (or largely) cached in memory.

You'll suffer after any update under a high search load, since it'll be
a completely cold cache start (for the reasons you give in text I didn't
quote) with the search load full on from the start.  Searches will take
longer, so more of them will be running concurrently, so each will take
longer still, and so on.  If you can arrange to do the update when the
search load is low, it'll probably be OK.
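
To make the cold-start point concrete: one way to take the edge off it
might be to read the updated database files through once on each
front-end before the search load hits them.  This is only a rough sketch
in Python, not anything Xapian provides; the function name is made up
for illustration, and whether the data actually stays cached depends on
the OS and (on NFS) on the client's caching behaviour.

```python
import os


def warm_page_cache(db_dir, chunk_size=1 << 20):
    """Read every regular file under db_dir once so the OS can cache it.

    Hypothetical helper, not part of Xapian.  Returns the total number
    of bytes read.
    """
    total = 0
    for root, _dirs, files in os.walk(db_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            if not os.path.isfile(path):
                continue  # skip sockets, dangling symlinks, etc.
            with open(path, "rb") as fh:
                # Read in chunks and discard the data; the point is only
                # to pull the blocks into the page cache.
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

This could only help if the database fits (or largely fits) in memory,
which is the case Richard describes above.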

> Which reminds me - I must update the documentation of the replication 
> stuff in SVN.  There are currently two documents about it: 
> xapian-core/docs/replication.rst is an overview of how and why to use 
> it, and xapian-core/docs/replication_protocol.rst covers (some of) the 
> internals of the replication system.

Available online for anyone curious:

http://trac.xapian.org/browser/trunk/xapian-core/docs/replication.rst
http://trac.xapian.org/browser/trunk/xapian-core/docs/replication_protocol.rst

> I think a good chunk of the 
> replication.rst documentation should probably move to admin_notes.rst 
> (in particular, the "Alternative approaches" section probably belongs 
> there, since it's not really about the replication stuff).

It isn't about the replication feature as such, but it is about the
topic of "database replication" and illuminates the design decisions in
the replication feature so I think it's reasonable to cover it there.

It doesn't seem to fit particularly well in admin notes either.
Currently that says:

   The intended audience is system administrators who need to be able to
   perform general management of a Xapian database, including tasks such
   as taking backups and optimising performance.  It may also be useful
   introductory reading for Xapian application developers.

Replication seems to be more something you want to consider when
designing the system rather than "general management".

So I'd suggest mostly leaving it where it is but adding a link from the
admin document (since a sysadmin should at least be aware the feature
exists).

The NFS caching issue is rather relevant to admins, especially if the
cache gets flushed even for the client modifying the database (as that
would mean putting the database on NAS isn't good).  It's probably also
worth noting explicitly that rsyncing databases doesn't reliably
allow searching during the update.
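
As an illustration of the rsync point: a general workaround (not
something the replication feature does, and not proposed elsewhere in
this thread) is to sync into a fresh directory and then atomically
repoint a symlink at it, so searchers never open a half-updated copy.
A minimal sketch in Python, assuming a POSIX filesystem; the function
name and paths are hypothetical:

```python
import os


def switch_current(new_db_dir, link_path):
    """Atomically repoint the symlink link_path at new_db_dir.

    Hypothetical helper.  Assumes link_path is either absent or already
    a symlink, and that the copy in new_db_dir is complete.
    """
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted run
    os.symlink(new_db_dir, tmp_link)
    # Renaming over the old link is atomic on POSIX, so anyone opening
    # link_path sees either the old copy or the new one, never a mix.
    os.replace(tmp_link, link_path)
```

Searchers would need to open the link path afresh to pick up the new
copy, and the old copy would have to be kept until they've done so.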

Cheers,
    Olly


