[Xapian-discuss] Python bindings - xapian.Database.reopen

Richard Boulton richard at lemurconsulting.com
Tue Apr 14 15:35:29 BST 2009


On Tue, Apr 14, 2009 at 04:15:10PM +0200, Cedric Jeanneret wrote:
> Hello,
> 
> I'm using xapian in a pylons application, with pythons libs/bindings...
>
> My indexes are created on other servers, then rsync-ed to my search
> engine... It seems that sometimes this process do some mess, as my Pylons
> app returns a big error :

This is the problem - if you rsync a database which is being modified,
you'll get half the old database and half the new database.  It is not safe
to rsync a database which is in the process of being modified, because
rsync is not an atomic copy operation.  In fact, even if the database isn't
being modified, you'll get errors like the one you report if you try to
search while the rsync is happening (though at least in that case, once the
rsync is finished, the database should be valid again).

This is why the 1.1.0 release will support replication that is safe to run
against a live database.  See
http://trac.xapian.org/browser/trunk/xapian-core/docs/replication.rst for
details (it has a long section on alternative approaches to replication,
including rsync, which may interest you).  If you want to try this out, use
SVN trunk (which is very close to release, though no promises that we won't
need to change something at the last minute).

If you must rsync, you need to stop the indexer, take a copy of the
database on the client, rsync to update the copy, and then swap that copy
in place of the old database on the client.  Preferably, use a stub
database to control which database is live on the client, and use "rename"
to update that stub database file (rename, when used to move a file to
replace another file, is an atomic operation, unless you're on Windows).
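
The final swap step could look roughly like the sketch below.  This is not
code from the Xapian distribution: the function name switch_live_database
and the 0644 permissions are hypothetical choices, and it assumes the stub
file holds a single line of the form "auto <path>" and that os.replace
(Python 3.3 or later) is available.

```python
import os
import tempfile


def switch_live_database(stub_path, new_db_path):
    """Point the stub database file at new_db_path using an atomic rename.

    Hypothetical helper; assumes a one-line stub of the form "auto <path>".
    """
    stub_dir = os.path.dirname(os.path.abspath(stub_path))
    # Write the new stub next to the old one so the rename stays within a
    # single filesystem (rename is only atomic within one filesystem).
    fd, tmp_path = tempfile.mkstemp(dir=stub_dir, prefix=".stub-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write("auto %s\n" % os.path.abspath(new_db_path))
            tmp.flush()
            os.fsync(tmp.fileno())
        # mkstemp creates the file as 0600; loosen it so a search process
        # running as another user can read it (assumed deployment detail).
        os.chmod(tmp_path, 0o644)
        # Readers see either the old stub or the new one, never a partial file.
        os.replace(tmp_path, stub_path)
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The search application would then always open the stub path, and the
indexing side would call something like switch_live_database("live.stub",
"db.new") only after the rsync into the copy has completed.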

> Error - xapian.DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
> [snip useless trace]
> DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation

This error is slightly misleading in this situation - in fact, due to the
rsync, your copy of the database is corrupt.

> Ok... so I'm trying to call xapian.Database.reopen().... but how ??
> 
> Trying to do so:
> try:
>   d = xapian.Database('my/db')
> except xapian.DatabaseModifiedError:
>   d = xapian.Database()
>   d.reopen('my/db')

Just to note: if the error had occurred due to local modifications, you'd
only need to call reopen() if you were re-using a database handle.  Here,
you're making a new database, so you just need to retry the operation.
It's academic, though, because the use of rsync has left you with an
invalid database which no amount of calling reopen() will fix.
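
For the case where a handle is re-used and the database is modified
locally, the reopen-and-retry pattern might be sketched as follows.  The
helper name run_with_reopen and its parameters are hypothetical; the
exception class is passed in so the sketch does not depend on the bindings
being importable, and with the Python bindings it would be
xapian.DatabaseModifiedError.

```python
def run_with_reopen(db, operation, modified_error, max_attempts=3):
    """Run operation(db), calling db.reopen() and retrying on modified_error.

    Hypothetical helper: db is a long-lived handle such as a xapian.Database,
    and operation is any callable that searches using that handle.
    """
    for attempt in range(max_attempts):
        try:
            return operation(db)
        except modified_error:
            if attempt == max_attempts - 1:
                # Still failing after several reopens: probably not just a
                # concurrent update, so let the caller see the error.
                raise
            # Move the existing handle to the latest revision, then retry.
            db.reopen()
```

A caller would pass its own search function, for example
run_with_reopen(db, do_search, xapian.DatabaseModifiedError), where
do_search is whatever code builds the Enquire object and fetches results.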

-- 
Richard


