[Xapian-discuss] xapian-replicate errors

Olly Betts olly at survex.com
Fri Nov 16 22:16:05 GMT 2012


On Fri, Nov 16, 2012 at 01:15:58PM -0800, Kevin Duraj wrote:
> Then our Xapian implementation is incorrect and we need to correct it.
> Only not deleted document should be replicated.

I would certainly disagree with your assertion that the current
implementation is incorrect.  As with any non-trivial system there are
trade-offs in the design.  If you (or anyone else) think you can improve
on those trade-offs, I certainly encourage you to have a go, and I look
forward to reviewing your patches.

It's probably going to be tricky to implement what you're suggesting
without it being slower though.  If the initial full copy is going
to turn a 63GB database with 33GB of currently unused space at one end
into a 30GB database at the other, then it will effectively have to
compact the database on the fly.  That's not especially hard to do, but
the problem then is that you can't just record the new version of every
block written on the master and replay those writes on the replica,
because the blocks on the replica will be entirely different.
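
To make the block-numbering problem concrete, here's a toy sketch in
Python.  It is not Xapian's real on-disk format or replication code: the
list-of-blocks model and the names compact() and apply_block_changeset()
are assumptions made up purely for illustration.

```python
# Toy model: a "database" is a list of blocks; None marks unused space.
# This is NOT Xapian's real on-disk format, just an illustration.

def compact(blocks):
    """Drop unused blocks, so the remaining blocks get new numbers."""
    return [b for b in blocks if b is not None]

def apply_block_changeset(blocks, changeset):
    """Replay whole-block writes given as {block_number: new_contents}."""
    blocks = list(blocks)
    for number, contents in changeset.items():
        while len(blocks) <= number:
            blocks.append(None)
        blocks[number] = contents
    return blocks

master = ["A", None, None, "B", None, "C"]
plain_replica = list(master)          # straight block-for-block copy
compacted_replica = compact(master)   # ["A", "B", "C"]

# The master rewrites block 3 (which held "B").
changeset = {3: "B2"}
master = apply_block_changeset(master, changeset)

plain_replica = apply_block_changeset(plain_replica, changeset)
compacted_replica = apply_block_changeset(compacted_replica, changeset)

print(compact(plain_replica))      # live data matches the master
print(compact(compacted_replica))  # stale "B" survives alongside "B2"
```

On the compacted copy, "B" lives in block 1 rather than block 3, so
replaying the master's write to block 3 leaves the old data in place.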

So you'd have to instead record changes at the key+value level, which
was an option we considered when designing replication.  The big drawback
is that on the replica you need to load the old version of each block to
apply changes.  With the current design you write an entire replacement
block, which reduces the disk I/O load on the replica substantially.
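
The I/O difference can be sketched with another toy model in Python.
Again this is only an illustration under made-up assumptions: the Replica
class, its counters, and passing the block number along with a key+value
change are all hypothetical simplifications, not how Xapian does it.

```python
# Toy model: each block is a dict of key -> value.  Not Xapian's real format.

class Replica:
    def __init__(self, blocks):
        self.blocks = {n: dict(b) for n, b in blocks.items()}
        self.reads = 0    # blocks loaded from "disk"
        self.writes = 0   # blocks written to "disk"

    def apply_block(self, number, new_block):
        """Block-level change: overwrite without loading the old block."""
        self.blocks[number] = dict(new_block)
        self.writes += 1

    def apply_key_value(self, number, key, value):
        """Key+value change: load the old block, modify it, write it back."""
        block = dict(self.blocks[number])
        self.reads += 1
        block[key] = value
        self.blocks[number] = block
        self.writes += 1

initial = {0: {"a": 1, "b": 2}, 1: {"c": 3}}
by_block = Replica(initial)
by_key = Replica(initial)

# The same two updates, expressed both ways.
by_block.apply_block(0, {"a": 10, "b": 2})
by_block.apply_block(1, {"c": 30})
by_key.apply_key_value(0, "a", 10)
by_key.apply_key_value(1, "c", 30)

print(by_block.reads, by_block.writes)  # 0 2
print(by_key.reads, by_key.writes)      # 2 2
```

Both replicas end up with identical contents, but the key+value one had
to read every block it changed before it could write it.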

If you really have a database with 33GB of slack space you want to start
replicating, you can just compact it first, then start to replicate
it.  But note that this isn't the situation which Denis actually had.

Cheers,
    Olly