[Xapian-devel] Database Locking
Olly Betts
olly at survex.com
Sun Jan 24 06:48:52 GMT 2010
On Sat, Jan 23, 2010 at 02:27:42AM +0000, Jeff Stearns wrote:
> I know nothing about the internals of Xapian, but I wonder whether this
> cunning plan is more complex and expensive than necessary.
>
> I wonder why Xapian doesn't apply the flock, then set a flag indicating that
> the database is locked.
How do you track such a flag?
You can't use the database path, because symlinks (and also bind mounts on
Linux) mean that the same database could be opened via entirely different
paths.
You could try to use the device and inode number of the lock file. Not all
filing systems have the concept of inodes, so you couldn't support those FSes,
though maybe all those that support fcntl() locking have inodes.
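
To illustrate the path vs. device/inode point, here is a minimal sketch in
Python (not Xapian code). The file name "lockfile" and the helper names are
made up for the example. It reaches the same lock file through a symlinked
directory and compares both the path strings and the (st_dev, st_ino) pairs.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode) for path; os.stat() follows symlinks."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def demo_identity():
    """Return (paths_differ, identities_match) for one file seen via two paths."""
    with tempfile.TemporaryDirectory() as tmp:
        db_dir = os.path.join(tmp, "db")
        os.mkdir(db_dir)
        # "lockfile" is a placeholder name, not Xapian's real lock file name.
        lock_path = os.path.join(db_dir, "lockfile")
        open(lock_path, "w").close()

        # A second route to the same directory, as a symlink would give you.
        alias_dir = os.path.join(tmp, "alias")
        os.symlink(db_dir, alias_dir)
        alias_lock = os.path.join(alias_dir, "lockfile")

        paths_differ = lock_path != alias_lock
        identities_match = file_identity(lock_path) == file_identity(alias_lock)
        return paths_differ, identities_match
```

On a filing system with inodes, the two path strings differ while the
device/inode pairs agree, which is why the path is no good as a key.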
There also seems to be a race condition here - you have to use stat() to get
the device and inode (you can't open the lock file and call fstat(), because
if this process already holds the lock, you will release it when you close
that fd), but then between your stat() call and acting on its result, another
process could remove and recreate the database, changing the inode of the lock
file. I guess you could have a second file which is the "inode of the
database", open that, and then use fstat() on the fd.
I have also wondered about just locking based on the uuid of the database
(now that databases have uuids). This means if you copy a database, you
couldn't update the old and new version at once (without forcing a new uuid)
which is probably enough to sink the idea, but at least it fails in a safe
way (can't update when it would be safe, rather than allowing update when it
would be dangerous).
> Now whenever Xapian goes to open a database,
> it would first check whether the flag is set. If so, Xapian knows that
> the database is already open within this process. If the flag is not set,
> Xapian continues onward, probing the file with flock to see whether it's
> open within some other process.
You are still vulnerable to user code in the same process opening and closing
the lock file, thus unlocking the database. As an example of this, consider a
file system
indexer which uses Xapian and ends up accidentally crawling its own database
directory - at LCA earlier this week, Carl Worth said this was something he'd
managed to accidentally do with notmuch, so this isn't just a theoretical
worry.
There doesn't seem to be any workaround for this issue, short of a separate
process. It's just how fcntl() is defined to work.
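
The behaviour is easy to reproduce outside Xapian. The following is a rough
Python sketch, assuming a POSIX system with fork() and a local filing system.
The file name and function names are invented for the example. The process
takes an fcntl() lock, then unrelated code in the same process opens, reads
and closes the same file, and a forked child probes whether the lock is still
held before and after.

```python
import fcntl
import os
import tempfile


def locked_by_other(path):
    """Fork a child which tries a non-blocking fcntl() lock on path.

    Returns True if the child was refused, i.e. the parent still holds it.
    """
    pid = os.fork()
    if pid == 0:
        status = 2  # unexpected failure, e.g. the open failed
        try:
            fd = os.open(path, os.O_RDWR)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                status = 0  # got the lock, so nobody else held it
            except OSError:
                status = 1  # refused, so another process holds it
        finally:
            os._exit(status)
    _, wstatus = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wstatus) == 1


def demo_lock_lost():
    """Return (held_before, held_after) around an innocent open/close."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lockfile")  # placeholder name
        lock_fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)
        fcntl.lockf(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        held_before = locked_by_other(path)

        # Unrelated code in the same process, e.g. an indexer crawling files.
        with open(path, "rb") as f:
            f.read()
        # Closing that second descriptor drops this process's fcntl() locks
        # on the file, even though lock_fd is still open.

        held_after = locked_by_other(path)
        os.close(lock_fd)
        return held_before, held_after
```

With POSIX fcntl() semantics the lock is held before the stray open/close and
gone afterwards, without lock_fd ever being closed.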
> This should work in a threaded environment so long as the customary
> synchronization primitives are used to avoid race conditions.
That would require us to include threading code in Xapian, which was painful
to do portably last time we had any. But that was years ago, so hopefully
that's no longer the case.
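
For reference, the in-process flag itself is not much code. Here is a minimal
Python sketch of the idea, not anything Xapian does; the class and method
names are invented, and the key could be a (device, inode) pair or a uuid.
Note that it only covers opens made through the library, so it does nothing
about user code opening and closing the lock file.

```python
import threading


class OpenWriterRegistry:
    """Process-wide set of databases currently open for writing."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._open = set()

    def try_claim(self, key):
        """Set the flag for key; return False if it was already set."""
        with self._mutex:
            if key in self._open:
                return False
            self._open.add(key)
            return True

    def release(self, key):
        """Clear the flag for key (harmless if it was not set)."""
        with self._mutex:
            self._open.discard(key)
```

A caller would try_claim() before probing the fcntl() lock, and release() when
the writer is closed or if taking the fcntl() lock fails.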
If you really think you have a better scheme for locking, probably the best
way forward is to code up a prototype (either as a patch to the current code
or a standalone example) and then we can check that it works on various
platforms and FSes, and see how the performance differs. At some point we
are likely to want readers to lock revisions too, so locking performance is
likely to become more important.
Cheers,
Olly