Database left unlocked by Tcl bindings

Olly Betts olly at survex.com
Thu Feb 25 23:37:52 GMT 2016


On Thu, Feb 25, 2016 at 05:21:17PM +0100, Eric J wrote:
> On Thu, 25 Feb 2016 02:24:51 +0000, Olly Betts <olly at survex.com> wrote:
> > It's clearly not as simple as execl() always releasing the lock, but I
> > don't think we've ruled out the OS entirely yet - the above isn't
> > exactly equivalent to the Tcl code, as the two databases are created by
> > the same process in Tcl but different processes with simpleindex.
> 
> but the same problem happens from two different Tcl processes - both
> succeed because there is no lock.

Ah, OK - I missed that detail.

> Finally, it appears that it does work with Tcl 8.5 (actually a tclkit,
> but does not work with an 8.6 tclkit).

I'm testing with Tcl 8.6 (Debian package 8.6.4+dfsg-3), and it works for
me.

So it does seem it must be due to something your Tcl interpreter is
doing, but I'm struggling to think what that could be.

If the close-on-exec flag (FD_CLOEXEC, as set by opening with O_CLOEXEC)
were set on the lock fd when execl() was called, the fd would get closed
and the lock released.  But your lsof shows the fd open but not locked
in the child process after it has exec-ed cat.
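
To illustrate the close-on-exec behaviour mentioned above, here is a
rough sketch in Python rather than Tcl or C.  It is not Xapian's locking
code; the helper name fd_survives_exec and the fd number 200 are made up
for the example, and it assumes a POSIX system where os.fork() is
available.  It only shows whether a descriptor is still open after
exec, depending on the close-on-exec flag.

```python
import os
import sys

# Hypothetical fd number, chosen high so the new interpreter is unlikely
# to reuse it for its own files during startup.
TEST_FD = 200


def fd_survives_exec(inheritable):
    """Return True if a descriptor is still open in the exec-ed program."""
    r, w = os.pipe()  # the read end stands in for the lock fd
    pid = os.fork()
    if pid == 0:
        # Child: place the descriptor at a known number.  With
        # inheritable=False, close-on-exec is set on the new descriptor.
        os.dup2(r, TEST_FD, inheritable=inheritable)
        probe = (
            "import os, sys\n"
            "try:\n"
            "    os.fstat(%d)\n"
            "except OSError:\n"
            "    sys.exit(1)\n"
            "sys.exit(0)\n" % TEST_FD
        )
        try:
            os.execv(sys.executable, [sys.executable, "-c", probe])
        finally:
            os._exit(2)  # only reached if execv itself failed
    _, status = os.waitpid(pid, 0)
    os.close(r)
    os.close(w)
    return os.WEXITSTATUS(status) == 0
```

Calling fd_survives_exec(False) models the case where the flag is set, so
the fd disappears at exec; fd_survives_exec(True) models the case seen in
the lsof output, where the fd is still open in the exec-ed process.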

If there were a second fd open on the lock file which gets closed
in the child process after the lock is taken, that would release the
lock.  But we carefully close all other open fds before taking the
lock to avoid that.
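
The "second fd" hazard can be reproduced outside Xapian.  The following
is a minimal sketch in Python, assuming POSIX fcntl() record locks
(which is what Python's fcntl.lockf() uses) and a system with
os.fork(); the function names are invented for the example.  The probe
runs in a forked child because a process never conflicts with its own
fcntl() locks, and such locks are not inherited across fork().

```python
import fcntl
import os
import tempfile


def lock_held_by_other(path):
    """Return True if some other process holds a lock on path."""
    pid = os.fork()
    if pid == 0:
        # Child: try a non-blocking exclusive lock, report via exit code.
        fd = os.open(path, os.O_RDWR)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os._exit(1)  # the lock is held elsewhere
        os._exit(0)      # we got the lock, so nobody else held it
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1


def demo():
    """Take a lock, then open and close a second fd on the same file."""
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)
    try:
        lock_fd = os.open(path, os.O_RDWR)
        fcntl.lockf(lock_fd, fcntl.LOCK_EX)
        before = lock_held_by_other(path)

        # Closing ANY descriptor for the file drops this process's
        # fcntl() locks on it, even though lock_fd itself stays open.
        second_fd = os.open(path, os.O_RDONLY)
        os.close(second_fd)
        after = lock_held_by_other(path)

        os.close(lock_fd)
        return before, after
    finally:
        os.unlink(path)
```

On Linux, demo() should report the lock as held before the second fd is
closed and as gone afterwards, with lock_fd still open throughout.  That
matches an lsof view of "fd open but not locked".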

Cheers,
    Olly


