[Xapian-devel] Problems with /bin/cat and flintlock?

Fri Apr 8 06:39:32 BST 2011

Okay, here is a good example:

Child process (/bin/cat)

# strace -p 28902
Process 28902 attached - interrupt to quit
read(0, 0x89b0000, 32768)               = ? ERESTARTSYS (To be restarted)
--- SIGHUP (Hangup) @ 0 (0) ---
read(0, 

Parent process (Ruby/Xapian Bindings)- At this point I assume the database is being closed:
unlink("/srv/www/www.oriontransfer.co.nz/xapian.db/postlist.baseA") = 0
pwrite64(10, "\0\0\t2\1\33\331\33\331\0\215\37\371\37\355\37\341\37\326\37\313\37\277\37\262\37\224\37\210\37y\37"..., 8192, 24576) = 8192
pwrite64(10, "\0\0\t2\0\1\275\r\360\1'\37\371\37\350\2\344\5\331\34c\34R\0342\34\35\34\t\33\365\33"..., 8192, 16384) = 8192
pwrite64(6, "\0\0\t2\1\26\213\35\226\0i\37\371\37\7\36A\36 \0366\36\25\36+\32\204\32y\32n\32"..., 8192, 8192) = 8192
pwrite64(6, "\0\0\t2\0\r\32\0205\0\21\31\272\20C\r+\17E\rr\f\354\t0\1J\f\213\5$\1"..., 8192, 761856) = 8192
pwrite64(4, "\0\0\t2\1\0q\20\345\2]\37\371\37\356\37\267\37\254\37\241\37\226\37\213\37\200\37u\37j\37"..., 8192, 8192) = 8192
open("/srv/www/www.oriontransfer.co.nz/xapian.db/postlist.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
write(3, "\262\22\5\200@\3\1\21\341B\201\1\0\0\262\22\374\237\201\1\0\302\377_S\37\336a\272\375\3\0"..., 35) = 35
fsync(3)                                = 0
close(3)                                = 0
fsync(10)                               = 0
rename("/srv/www/www.oriontransfer.co.nz/xapian.db/postlist.tmp", "/srv/www/www.oriontransfer.co.nz/xapian.db/postlist.baseA") = 0
pread64(10, "\0\0\t2\1\33\331\33\331\0\215\37\371\37\355\37\341\37\326\37\313\37\277\37\262\37\224\37\210\37y\37"..., 8192, 24576) = 8192
open("/srv/www/www.oriontransfer.co.nz/xapian.db/position.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
write(3, "\262\22\5\200@\23\1\22\266\333\2\217\1\0\0\262\22\377\377\357\377\377\377\377\377\377\377\337\377\377\377\376"..., 37) = 37
fsync(3)                                = 0
close(3)                                = 0
fsync(9)                                = 0
rename("/srv/www/www.oriontransfer.co.nz/xapian.db/position.tmp", "/srv/www/www.oriontransfer.co.nz/xapian.db/position.baseA") = 0
pread64(9, "\0\0\t$\1\23h\27\207\1\33\37\371\37\354\37\337\37\320\37\304\37\267\37\251\37\235\37\220\37\204\37"..., 8192, 155648) = 8192
open("/srv/www/www.oriontransfer.co.nz/xapian.db/termlist.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
write(3, "\262\22\5\200@\1\1\f\205\2]\0\0\262\22\327\373}\377\17\323o\0\0\0f\"\262\22", 29) = 29
fsync(3)                                = 0
close(3)                                = 0
fsync(6)                                = 0
rename("/srv/www/www.oriontransfer.co.nz/xapian.db/termlist.tmp", "/srv/www/www.oriontransfer.co.nz/xapian.db/termlist.baseA") = 0
pread64(6, "\0\0\t2\1\26\213\35\226\0i\37\371\37\7\36A\36 \0366\36\25\36+\32\204\32y\32n\32"..., 8192, 8192) = 8192
open("/srv/www/www.oriontransfer.co.nz/xapian.db/record.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
write(3, "\262\22\5\200@\1\1K\205\2\322\4\0\0\262\22\357_\1\0\237\377\376\377\337\337\377\377w\376\177\374"..., 93) = 93
fsync(3)                                = 0
close(3)                                = 0
fsync(4)                                = 0
rename("/srv/www/www.oriontransfer.co.nz/xapian.db/record.tmp", "/srv/www/www.oriontransfer.co.nz/xapian.db/record.baseA") = 0
pread64(4, "\0\0\t2\1\0q\20\345\2]\37\371\37\356\37\267\37\254\37\241\37\226\37\213\37\200\37u\37j\37"..., 8192, 8192) = 8192
close(10)                               = 0
close(9)                                = 0
close(6)                                = 0
close(4)                                = 0
close(5)                                = 0
kill(28902, SIGHUP)                     = 0
waitpid(28902, 

Then here we are left hanging.

Once I do the following:

# kill -SIGKILL 28902

Then in the parent process things move forward:

waitpid(28902, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], 0) = 28902
--- SIGCHLD (Child exited) @ 0 (0) ---
select(10, [7 8], [], [], {0, 0})       = 0 (Timeout)
gettimeofday({1302241064, 99157}, NULL) = 0
select(10, [7 8], [2], [], NULL)        = 1 (out [2])
select(10, [7 8], [2], [], NULL)        = 1 (out [2])
write(2, "Index update finished.\n", 23) = 23
select(10, [7 8], [], [], {0, 0})       = 0 (Timeout)
time(NULL)                              = 1302241064
clock_gettime(CLOCK_MONOTONIC, {586863, 666397480}) = 0
clock_gettime(CLOCK_MONOTONIC, {586863, 666425477}) = 0
select(10, [7 8], [], [], {3599, 999972}

To answer your first question:
# uname -a
Linux ayako.oriontransfer.org 2.6.26-2-xen-686 #1 SMP Sun Nov 21 01:58:51 UTC 2010 i686 GNU/Linux

Kind regards,
Samuel

On 8/04/2011, at 2:41 PM, Olly Betts wrote:

> On Fri, Apr 08, 2011 at 02:56:22AM +1200, Samuel Williams wrote:
>> I've been having intermittent issues with the flintlock code - it
>> seems that the function FlintLock::lock is never returning and this is
>> locking up the Ruby process.
> 
> What OS is this on?  That's likely to be highly relevant.
> 
>> At this point, using strace I found that the application process
>> seemed to be stuck in on
>> 00219         ssize_t n = read(fds[0], &ch, 1);
>> 
>> Obviously child process was cat, nothing really interesting about that.
> 
> The child process should send a single character before it execs
> /bin/cat, which is what the parent is waiting to read there.
> 
> If the write() call in the child fails, then the child exits, so
> unless the OS fails to transfer the byte across the pipe, I struggle
> to see how we can end up in this situation.
> 
>> 00172         // Connect pipe to stdin and stdout.
>> 00173         dup2(fds[1], 0);
>> 00174         dup2(fds[1], 1);
>> 
>> Isn't this setting stdin and stdout to the same end of an existing
>> pipe? Does this make sense?
> 
> It's a bidirectional socket, so that's fine.
> 
>> Anyway, I thought I'd mention this because it is a consistent problem.
>> If there is anything you think I should do with strace, gdb, etc on
>> the processes next time it hangs, let me know.
> 
> It would be useful to attach gdb to the parent and child and do a
> backtrace in each (bt) to see exactly where we are.
> 
>> One option to fix the bug without really understanding the real issue
>> would be to use select in the parent thread, rather than read. Then,
>> use a timeout of a few seconds so that if the child doesn't acquire
>> the lock within x seconds, it is as good as failed.
> 
> I'd prefer to understand the issue rather than paper over it.  Locking
> is rather a critical operation to get right!
> 
> Also, it's rather unclear what a suitable threshold is - you can use
> fcntl locking over NFS if you run the lock daemon, so a few seconds to
> get a lock is probably not impossible with a busy NFS server.
> 
> Cheers,
>    Olly