[Xapian-discuss] NFSv4 and locking

Olly Betts olly at survex.com
Mon Dec 8 01:21:57 GMT 2008


On Wed, Dec 03, 2008 at 09:51:43AM -0800, Peter A. Friend wrote:
> I've only had problems with lockd, so I don't use it. Using a lock file
> is more reliable. If it's mounted v3, O_EXCL|O_CREAT should work just
> fine; if it's v2 you will probably need to use something fancier like a
> symlink to a uniquely named temporary file.

It's not hard to create a lock file in an NFS-safe way (the open(2) man
page on Linux describes how in the O_EXCL section).
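
For reference, the recipe in that man page section can be sketched roughly
as below. This is only a minimal illustration in Python, not what quartz
actually did; the function names and the ".lock.<host>.<pid>" naming are
made up for the example, and it assumes only one thread per process tries
to take the lock at a time.

```python
import os
import socket


def acquire_lock(lock_path):
    """Try to take lock_path; return True on success, False if it is held."""
    lock_dir = os.path.dirname(os.path.abspath(lock_path))
    # Unique file on the same filesystem, named after hostname and PID
    # (hypothetical naming scheme, chosen for this sketch).
    unique = os.path.join(
        lock_dir, ".lock.%s.%d" % (socket.gethostname(), os.getpid()))
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)
    try:
        try:
            # link() fails if lock_path already exists.
            os.link(unique, lock_path)
        except OSError:
            # Don't trust the result of link() on its own; the link count
            # checked below is what decides.
            pass
        # Two links to the unique file means our link() created lock_path.
        return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)


def release_lock(lock_path):
    """Drop the lock. Nothing does this for us if the process dies."""
    os.unlink(lock_path)
```

A caller would do something like acquire_lock("dbdir/db_lock"), and call
release_lock() on the same path when done.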

We used to use this with quartz, but the big flaw in this approach is
that if the process exits uncleanly, the lock file is left behind.
That's the key reason why we changed the approach for flint, and it
was a big improvement and I really don't want to go back.

In the NFS case, you really can't tell if a lock is live or not.  The
best approach I've heard is for the process holding it to touch it
periodically, but for Xapian we'd have to have a "toucher" thread to do
that, and it means there's a delay after a crash before you can tell the
database isn't in use.  There's also a problem if the system load climbs
and the toucher doesn't get a look in for too long, and issues with
clock synchronisation between the NFS server and clients.
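
To make the "toucher" idea concrete, here is a rough sketch in Python.
It isn't anything Xapian implements; the class name, the function name,
and the interval and max_age values are all assumptions for illustration.

```python
import os
import threading
import time


class LockToucher:
    """Background thread which refreshes the lock file's mtime."""

    def __init__(self, lock_path, interval=10.0):
        self.lock_path = lock_path
        self.interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # wait() returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.interval):
            os.utime(self.lock_path, None)  # set mtime to "now"


def looks_stale(lock_path, max_age):
    """Guess whether the holder has died: True if mtime is too old."""
    # This compares the local clock with a timestamp on the file, so clock
    # skew between machines can give the wrong answer.
    age = time.time() - os.stat(lock_path).st_mtime
    return age > max_age
```

Note that max_age has to be comfortably larger than interval to survive
a loaded system, and max_age is also the delay after a crash before
anyone else can tell the lock is dead.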

For a local database, you can store the hostname and PID in the lock
file, but the PID may get reused in the meantime, causing a stale lock to
appear fresh, and it's not easy to remove the stale lock in a non-racy
way.
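
A sketch of the hostname-and-PID check, again in Python with made-up
function names, and assuming a POSIX system where sending signal 0 only
tests whether the process exists:

```python
import os
import socket


def write_owner(lock_path):
    """Record which host and process holds the lock."""
    with open(lock_path, "w") as f:
        f.write("%s %d\n" % (socket.gethostname(), os.getpid()))


def owner_may_be_alive(lock_path):
    """Return False only if the recorded owner is known to be gone."""
    with open(lock_path) as f:
        host, pid = f.read().split()
    if host != socket.gethostname():
        # We can't probe a process on another machine, so assume it's live.
        return True
    try:
        os.kill(int(pid), 0)  # signal 0: existence check, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    # Some process has this PID, though after PID reuse it may not be the
    # one which wrote the lock file.
    return True
```

Even when owner_may_be_alive() returns False, deleting the lock file is
still racy: another process can take the lock between the check and the
unlink.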

As a result, people would do "rm dbdir/db_lock" and then they'd
occasionally get a corrupted database when there was actually another
process using it.

FWIW, I've not experienced problems with NFS locking myself (other than
it not being supported when lockd isn't run).

Cheers,
    Olly


