[Xapian-discuss] swapping database mid replication

Richard Boulton richard at tartarus.org
Fri Mar 4 19:03:33 GMT 2011


On 4 March 2011 18:19, Wes Chow <wes.chow at s7labs.com> wrote:
> The Xapian documentation talks about a situation where you don't want to do
> two database swaps while in the process of replicating. It says...
>
> To confuse the replication system, the following needs to happen:
>
> 1. Start with two databases, A and B.
> 2. Start a replication of database A.
> 3. While the replication is in progress, swap B in place of A (ie, by moving
> the files around, such that B is now at the path of A).
> 4. While the replication is still in progress, swap A back in place of B.
>
> What if we omit step #4? Does that mess up replication on the client as
> well? Specifically, the situation I'm thinking of is this:
>
> 1. Start with database A
> 2. Start replication of database A.
> 3. Compact A to B.
> 4. While replication is in progress: mv A A.orig && mv B A
>
> How does this impact the replicating client? Do queries on the client during
> the replication act on A, or some meld of A and B?

Firstly, replication is always done in atomic steps as far as
executing queries on the client is concerned; the client never sees a
meld of two versions of a database.  (Note: each changeset is applied
as a step, so there may be several updates in a single replication
run, but each step is applied atomically.)

Secondly, I wouldn't recommend swapping databases like that in any
situation; even without replication, Xapian may get confused when
attempting to open the database, opening some of the old files and
some of the new files, causing an error to occur.  Instead, you should
use a stub database to point to the current database you want to use,
and atomically replace it with a new version when you want to change,
using a single mv.  (See http://xapian.org/docs/overview.html ,
section "Specifying a database").
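
As a rough illustration of the stub approach above, here is a minimal
Python sketch.  It assumes the stub file is a text file with one
"<type> <path>" line per database (e.g. "auto /path/to/db"), as
described in the overview document; the function name point_stub_at
and the file names are made up for this example.  The new stub is
written to a temporary file in the same directory and then renamed
over the old one, which is the "single mv".

```python
import os
import tempfile


def point_stub_at(stub_path, db_path, backend="auto"):
    """Atomically rewrite the stub file so that it refers to db_path.

    Hypothetical helper: assumes a stub file made of "<type> <path>" lines.
    """
    stub_dir = os.path.dirname(os.path.abspath(stub_path))
    # Create the temporary file next to the stub so that the final rename
    # stays within one filesystem, which is what makes it atomic.
    fd, tmp_path = tempfile.mkstemp(dir=stub_dir, prefix=".stub-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"{backend} {os.path.abspath(db_path)}\n")
            f.flush()
            os.fsync(f.fileno())
        # mkstemp creates the file with mode 0600; make it readable by others.
        os.chmod(tmp_path, 0o644)
        # The equivalent of a single "mv": readers see either the old stub
        # or the new one, never a half-written file.
        os.replace(tmp_path, stub_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this, switching from A to a compacted copy B would be a call
such as point_stub_at("current.stub", "B"), leaving the files of A
and B themselves where they are.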

So, the interesting part: what happens if you follow the steps you
list anyway?  The possibilities are as follows (depending on exactly
when you do the move):

 - Replication completes successfully, producing a clone of A on the
client.  The next replication run would replace that with a clone of
B.
 - Replication completes successfully, producing a clone of B on the client.
 - Replication reports an error, stating that a file is missing from
the database, or that the remote database or changesets are corrupt in
some way.  The client will be left with a valid database, but not a
copy of the server's current database.
 - A fourth possibility is that, by coincidence, database B
contains some files which happen to have the same version numbers as
the files in database A.  There are some safeguards against this
happening, but the replication code (and the rest of Xapian) isn't
designed to work if the database is changed underneath it, and these
safeguards are not foolproof, so I believe there is some risk that
this will result in a corrupt database at the client.

(In writing this email, I've realised that we could make the
replication slightly more foolproof by adding the uuid of the database
to each changeset file.  The client could then compare this with the
uuid it was expecting, and give an error if they differ.  This would
prevent a changeset from the B database being applied to a copy of the
A database.  However, there would still be potential problems if a
full database copy happened after the replacement of A with B.)
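
A sketch of the uuid check suggested above, in Python.  This is not
Xapian's changeset format or API: the Changeset class, its fields and
the apply_changeset function are hypothetical names used only to show
the comparison the client would make before applying anything.

```python
from dataclasses import dataclass


class ChangesetMismatch(Exception):
    """Raised when a changeset belongs to a different database."""


@dataclass
class Changeset:
    # Hypothetical structure: a real changeset carries more than this.
    db_uuid: str
    from_revision: int
    to_revision: int


def apply_changeset(local_uuid, local_revision, changeset):
    """Return the new local revision, or raise if the changeset does not fit."""
    # The proposed safeguard: refuse changesets made from another database.
    if changeset.db_uuid != local_uuid:
        raise ChangesetMismatch(
            f"expected uuid {local_uuid}, got {changeset.db_uuid}"
        )
    # A changeset must also start from the revision the client is at.
    if changeset.from_revision != local_revision:
        raise ChangesetMismatch(
            f"expected revision {local_revision}, got {changeset.from_revision}"
        )
    return changeset.to_revision
```

In the scenario from this thread, a client holding a copy of A would
reject a changeset generated from B at the first check, instead of
relying on the version numbers happening to differ.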

-- 
Richard


