[Xapian-devel] Problems with /bin/cat and flintlock?

Samuel Williams space.ship.traveller at gmail.com
Wed Sep 7 16:27:46 BST 2011


Hi,

It seems like the latest patch has fixed the issue. I'm now running with 1.2.7 and don't seem to have the same deadlock in waitpid on either of my servers.

I'm happy that the problem has been fixed, but even though I spent some time working on this, I couldn't figure out how to create a synthetic test that revealed the problem.

Kind regards,
Samuel

On 2/06/2011, at 12:58 AM, Olly Betts wrote:

> On Sat, May 28, 2011 at 02:04:32AM +1200, Samuel Williams wrote:
>> In my synthetic test, I got the following behavior when attached to /bin/cat:
>> 
>> # strace -p 2091
>> Process 2091 attached - interrupt to quit
>> read(0, 0x89e7000, 32768)               = ? ERESTARTSYS (To be restarted)
>> --- SIGHUP (Hangup) @ 0 (0) ---
>> read(0, "", 32768)                      = 0
>> close(0)                                = 0
>> close(1)                                = 0
>> close(2)                                = -1 EBADF (Bad file descriptor)
>> exit_group(0)                           = ?
>> Process 2091 detached
>> 
>> This seems like the desired behavior.
> 
> So in this case, the restarted read() is reporting EOF on stdin.
> 
>> However, in the case where the WritableDatabase hung, I got the
>> following behavior:
>> 
>> # strace -p 25694
>> Process 25694 attached - interrupt to quit
>> read(0, 0x859b000, 32768)               = ? ERESTARTSYS (To be restarted)
>> --- SIGHUP (Hangup) @ 0 (0) ---
>> read(0, 
>> ^C <unfinished ...>
>> Process 25694 detached
> 
> Whereas here it doesn't.
> 
>> Well, I'm not sure why there is a difference. Both processes were
>> ignoring SIGHUP according to /proc/$pid/status
> 
> I tried using sigaction() to set SIGHUP to SIG_IGN, with SA_RESTART in
> sa_flags, but that makes no difference for me even rerunning the test
> more than 100 times.  This is the amended testcase patch:
> 
> http://oligarchy.co.uk/xapian/patches/xapian-ignoresighup1-testcase-with-SA_RESTART.patch
> 
> I also tried with 0 in sa_flags.
> 
> I wonder if there's some sort of race in the kernel and/or libc here
> which results in different behaviour sometimes, depending on the
> exact timing of things happening.
> 
> Cheers,
>    Olly
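
For anyone following along, here is a minimal sketch of the kind of setup described in the quoted message: setting SIGHUP to SIG_IGN through sigaction() with a caller-chosen sa_flags value (SA_RESTART or 0). This is not the linked testcase patch. The helper names ignore_sighup and sighup_is_ignored are hypothetical and exist only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper (not from the Xapian patch): set SIGHUP to SIG_IGN
 * with the given sa_flags, e.g. SA_RESTART or 0.
 * Returns 0 on success, -1 on error (errno set by sigaction). */
int ignore_sighup(int flags)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sa.sa_flags = flags;
    sigemptyset(&sa.sa_mask);

    return sigaction(SIGHUP, &sa, NULL);
}

/* Hypothetical helper: query the current SIGHUP disposition without
 * changing it. Returns 1 if ignored, 0 if not, -1 on error. */
int sighup_is_ignored(void)
{
    struct sigaction cur;

    if (sigaction(SIGHUP, NULL, &cur) != 0)
        return -1;

    return cur.sa_handler == SIG_IGN;
}
```

Calling ignore_sighup(SA_RESTART) or ignore_sighup(0) before starting the writer would correspond to the two variants mentioned above. An ignored disposition set this way is inherited across fork() and exec, which would be consistent with both processes showing SIGHUP as ignored in /proc/$pid/status, though that alone does not explain the difference between the two traces.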



