[Xapian-devel] Full MVCC in Brass

Adam Sjøgren asjo at koldfront.dk
Wed Aug 20 13:37:53 BST 2014


Olly Betts <olly at survex.com> writes:

> On Wed, Aug 20, 2014 at 10:22:57AM +0200, Adam Sjøgren wrote:

>> This would also make it easier to spool a consistent backup to tape of
>> an index that is being used (updated), right?

> If you mean for copying at the filesystem level, not really.

That was what I was thinking (hoping), yes.

> If you had a read lock held for all the time the backup was happening,
> you should ensure the backup contained enough information to recreate a
> working database from, but there's still no guarantee the backup itself
> would be a valid database.

So I couldn't do the logical equivalent of "restore the index to the
state at the time the read lock was taken", if I held that read lock
during the entire filesystem copy?

I guess I am thinking of this as the same as if the writer died while I
was holding a read lock, and the writer's changes weren't "committed"...

[...]

> For a large and actively updated database, keeping the revision around
> while it gets backed up could inflate the database size quite a bit.
> That space would get reused after the backup is taken though.

Yeah, but that is almost certainly less than what I am doing now:
replicating to a copy, stopping the replication while backing up the
copy, then starting replication again.

With that approach I have 2-3x disk-usage (2x constantly, some for
changesets, and then 1x extra when the backup takes longer than I have
changesets for, and the replication starts over from scratch).

> Another approach to allowing backups of a live database would probably
> be to make use of the replication changesets.  If you just backup the
> database at the filesystem level (without worrying about locks) then
> also backup any changesets created while the backup was running (and the
> next one created afterwards), then you should just be able to replay the
> changesets onto the database to restore.

That would work? Hm. My "mental image" of how these things work is far
from perfect :-)

So I could just set XAPIAN_MAX_CHANGESETS to a sufficiently high value
and back up the entire index directory (plus the newest changeset file),
and I could restore that and have a consistent index?
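
A minimal sketch of the backup side of that idea, in Python, just to make
the steps concrete. It rests on assumptions that are not confirmed in this
thread: that the changeset files live inside the database directory and are
named "changes" followed by a revision number, and that copying a directory
while it is being written to does not fail outright. The function names are
hypothetical, and replaying the changesets on restore is not shown.

```python
import shutil
from pathlib import Path

PREFIX = "changes"  # assumed changeset file name prefix (changes<N>)


def snapshot_changesets(db_dir):
    """Return the names of the changeset files currently in db_dir."""
    return {
        p.name
        for p in Path(db_dir).iterdir()
        if p.is_file()
        and p.name.startswith(PREFIX)
        and p.name[len(PREFIX):].isdigit()
    }


def copy_database(db_dir, backup_dir):
    """Plain filesystem-level copy of the database, taking no locks."""
    shutil.copytree(db_dir, Path(backup_dir) / "db")


def copy_new_changesets(db_dir, backup_dir, before):
    """Copy the changesets that appeared since the 'before' snapshot.

    Returns the copied names ordered by revision number.
    """
    target = Path(backup_dir) / "changesets"
    target.mkdir(parents=True, exist_ok=True)
    new = sorted(
        snapshot_changesets(db_dir) - before,
        key=lambda name: int(name[len(PREFIX):]),
    )
    for name in new:
        shutil.copy2(Path(db_dir) / name, target / name)
    return new


# Intended order of operations:
#   before = snapshot_changesets(db_dir)
#   copy_database(db_dir, backup_dir)
#   (wait until the writer has produced one more changeset)
#   copy_new_changesets(db_dir, backup_dir, before)
```

The writer would need XAPIAN_MAX_CHANGESETS set high enough that no
changeset created during the copy is deleted before it has been saved.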


  Best regards,

    Adam

-- 
 "HENCE, I think that all UNIXs should have an EMACS,         Adam Sjøgren
  and everybody should run UNIX!"                        asjo at koldfront.dk



