Database left unlocked by Tcl bindings

Eric J eric at deptj.eu
Wed Feb 24 15:30:55 GMT 2016


On Wed, 24 Feb 2016 03:17:35 +0000, Olly Betts <olly at survex.com> wrote:
>On Mon, Feb 22, 2016 at 12:26:27PM +0100, Eric wrote:
>> On Sun, 21 Feb 2016 22:33:22 +0000, Olly Betts <olly at survex.com> wrote:
>>> On Sun, Feb 21, 2016 at 02:15:25PM +0100, Eric J wrote:
>>>> I discovered, while trying to set up Tcl bindings for Notmuch
>>>> (https://notmuchmail.org/), which uses Xapian, that flintlock was not
>>>> being locked (I had lost updates).
>>> 
>>> It seems to work for me, testing with this:
>>> 
>>> package require Tcl 8
>>> package require xapian 1.0.0
>>> xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
>>> xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
>> 
>> 
>> eric@bruno [ ~ ]$ cat /proc/version
>> Linux version 3.13.300 (root@bruno) (gcc version 4.8.2 (GCC) ) #2 SMP
>> Tue Sep 16 21:01:43 BST 2014
>> eric@bruno [ ~ ]$ tclsh
>> % info patchlevel
>> 8.6.1
>> % package require Tcl 8
>> 8.6.1
>> % package require xapian 1.0.0
>> 1.2.18
> 
> I've tested with 1.2.18 and can't reproduce this with that version
> either (is that also the version of xapian-core you're running?  The
> 1.2.18 above is the bindings version I think).
> 
>> % xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
>> _e0c4b00000000000_p_Xapian__WritableDatabase
>> % xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
>> _f0d3b00000000000_p_Xapian__WritableDatabase
>> %
>> 
>> At which point
>> 
>> eric@bruno [ ~ ]$ lsof tmp.db/flintlock
>> COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
>> cat     13543 eric    5w   REG    8,9        0 930437 tmp.db/flintlock
>> cat     13552 eric    9w   REG    8,9        0 930437 tmp.db/flintlock
>> 
>> Blaming the execl is due to stepping through my copy of the lock code in
>> gdb, and seeing, in lsof, 5w on the open, still 5w on the fork, 5ww on
>> the fcntl, and 5w again on the execl.
> 
> Odd, as you said elsewhere, execl() shouldn't drop the lock.  It would
> be good to get to the bottom of this, as unreliable locking is a bad
> thing to have.
> 
> What FS are you running this on?

ext4

> Is use of Tcl actually a factor here, or can you reproduce it with
> just C++ code?
> 
> E.g. using the "simpleindex" example from the xapian-core sources:
> 
> examples/simpleindex tmp.db &
> examples/simpleindex tmp.db

  lfs@bruno [ /usr/src/sources-deptj/xapian-core-1.2.18 ]$ examples/simpleindex tmp.db &
  [1] 26157
  lfs@bruno [ /usr/src/sources-deptj/xapian-core-1.2.18 ]$ examples/simpleindex tmp.db
  DatabaseLockError: Unable to get write lock on tmp.db: already locked
  
  [1]+  Stopped                 examples/simpleindex tmp.db

so it is presumably not anything to do with the FS or the OS. I am
hoping that the right Tcl person (whoever that is) may pick something up
in an strace.

> More recent Xapian versions will try to use the new OFD locks and avoid
> the need to fork() and execl(), so will presumably avoid whatever is
> going on here.  But the OFD locks were added in Linux 3.15, so your
> kernel isn't quite new enough.

Yes, I saw that, and it is good, but my chances of moving to a newer
kernel soon are not good. And I would like to get to the bottom of this
anyway.
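
For anyone following along, here is the open/fork/fcntl/exec sequence as
I understand it, reduced to a minimal Python sketch. It is not Xapian's
actual code: the names acquire() and release() are made up, and the plain
pipe used to keep the child alive is an assumption for illustration. The
child takes the fcntl() lock, execs cat, and should hold the lock until
its stdin reaches EOF.

```python
import fcntl
import os


def acquire(path):
    """Try to lock `path` from a child process.

    Returns (pid, keepalive_fd) on success, or None if the lock is held
    by another process. Hypothetical helper, for illustration only.
    """
    lock_fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o666)
    stdin_r, keepalive_w = os.pipe()  # child reads this until we close it
    status_r, status_w = os.pipe()    # child reports whether it got the lock
    pid = os.fork()
    if pid == 0:
        # Child: take the POSIX (fcntl) lock without blocking.
        try:
            fcntl.lockf(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.write(status_w, b"n")
            os._exit(1)
        os.write(status_w, b"y")
        os.dup2(stdin_r, 0)  # cat will block reading from this pipe
        # Python opens fds close-on-exec by default; if the lock fd were
        # closed at exec time, the lock would be dropped with it.
        os.set_inheritable(lock_fd, True)
        try:
            os.execlp("cat", "cat")
        except OSError:
            os._exit(127)
    # Parent: it holds no lock itself, so closing its copy releases nothing.
    os.close(lock_fd)
    os.close(stdin_r)
    os.close(status_w)
    got = os.read(status_r, 1) == b"y"
    os.close(status_r)
    if not got:
        os.close(keepalive_w)
        os.waitpid(pid, 0)
        return None
    return pid, keepalive_w


def release(handle):
    """Let the child exit, which drops the lock it holds."""
    pid, keepalive_w = handle
    os.close(keepalive_w)  # cat sees EOF on stdin and exits
    os.waitpid(pid, 0)
```

Calling acquire() twice on the same path should give a handle the first
time and None the second. If the second call also succeeds, the lock is
being lost somewhere between the fcntl() and the exec.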

Thanx,

Eric
-- 
ms fnd in a lbry


