[Xapian-discuss] xapian-replicate errors

Kevin Duraj kevinduraj at gmail.com
Fri Nov 16 23:50:59 GMT 2012


Thank you for the explanation. I now agree with you that block-level replication is better because it uses fewer hard disk resources.

I was wondering whether it would be a bad idea to have a new "compact index replica" process that reads from the fragmented index and writes a new index, which could then be copied and used as a compact replica. That way we could keep using the current index while the new one is being compacted.

Kevin Duraj
http://myhealthcare.com
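
As a rough illustration of the "compact index replica" idea, here is a toy sketch in Python. It is not Xapian's on-disk format or API: the list of blocks, the use of None for a free block, and the name compact_copy are all assumptions made up for this example. It only shows that the source can be read while a smaller copy is written, and that the live blocks end up with different block numbers in the copy.

```python
def compact_copy(blocks):
    """Copy only the live blocks into a new list; None marks a free block.

    Returns the compacted list and a map from old to new block numbers.
    The source list is only read, never modified.
    """
    compacted = []
    renumbered = {}
    for old_no, block in enumerate(blocks):
        if block is not None:
            renumbered[old_no] = len(compacted)
            compacted.append(block)
    return compacted, renumbered


# Hypothetical fragmented index: three live blocks, three free ones.
fragmented = ["A", None, "B", None, None, "C"]
compacted, renumbered = compact_copy(fragmented)

print(compacted)    # ['A', 'B', 'C']
print(renumbered)   # {0: 0, 2: 1, 5: 2}
```

In this toy model, block 5 of the source becomes block 2 of the copy, so a block write recorded against the source could not be replayed on the copy as-is. For a real database, the xapian-compact tool already covers the part that reads one database and writes a compacted copy.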

On Nov 16, 2012, at 2:16 PM, Olly Betts <olly at survex.com> wrote:

> On Fri, Nov 16, 2012 at 01:15:58PM -0800, Kevin Duraj wrote:
>> Then our Xapian implementation is incorrect and we need to correct it.
>> Only documents that have not been deleted should be replicated.
> 
> I would certainly disagree with your assertion that the current
> implementation is incorrect.  As with any non-trivial system there are
> trade-offs in the design.  If you (or anyone else) think you can improve
> on those trade-offs, I certainly encourage you to have a go, and I look
> forward to reviewing your patches.
> 
> It's probably going to be tricky to implement what you're suggesting
> without it being slower though.  If the initial full copy is going
> to turn a 63GB database with 33GB of currently unused space at one end
> into a 30GB database at the other, then it will effectively have to
> compact the database on the fly.  That's not especially hard to do, but
> the problem then is that you can't just record the new version of every
> block written on the master and replay those writes on the replica,
> because the blocks on the replica will be entirely different.
> 
> So you'd have to instead record changes at the key+value level, which
> was an option we considered when designing replication.  The big drawback
> is that on the replica you need to load the old version of each block to
> apply changes.  With the current design you write an entire replacement
> block, which reduces the disk I/O load on the replica substantially.
> 
> If you really have a database with 33GB of slack space you want to start
> replicating, you can just compact it first, then start to replicate
> it.  But note that this isn't the situation which Denis actually had.
> 
> Cheers,
>    Olly
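
The I/O trade-off described in the quoted message can be sketched with a toy model in Python. Everything here is hypothetical: ToyReplica, apply_block and apply_key are names invented for the example, a block is modelled as a small dict, and the counters stand in for disk reads and writes. It is not how Xapian stores or ships changesets; it only counts what each approach would have to touch on the replica.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplica:
    """Hypothetical replica: block number -> dict of key/value pairs."""
    blocks: dict = field(default_factory=dict)
    reads: int = 0
    writes: int = 0

    def apply_block(self, block_no, contents):
        # Block-level change: overwrite the whole block, no read needed.
        self.blocks[block_no] = dict(contents)
        self.writes += 1

    def apply_key(self, block_no, key, value):
        # Key+value-level change: load the old block, patch it, write it back.
        old = dict(self.blocks.get(block_no, {}))
        self.reads += 1
        old[key] = value
        self.blocks[block_no] = old
        self.writes += 1


initial = {0: {"a": 1, "b": 2}, 1: {"c": 3}}
block_replica = ToyReplica(blocks={n: dict(b) for n, b in initial.items()})
key_replica = ToyReplica(blocks={n: dict(b) for n, b in initial.items()})

# The master changes key "a" in block 0 and key "c" in block 1.
block_replica.apply_block(0, {"a": 10, "b": 2})
block_replica.apply_block(1, {"c": 30})
key_replica.apply_key(0, "a", 10)
key_replica.apply_key(1, "c", 30)

print(block_replica.reads, block_replica.writes)  # 0 2
print(key_replica.reads, key_replica.writes)      # 2 2
```

Both replicas end up with the same contents, but the key+value version needs one extra read per changed block. The model hands apply_key the block number directly; a real key-level scheme would also have to find the right block first.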


