[Xapian-devel] Problems with /bin/cat and flintlock?
Samuel Williams
space.ship.traveller at gmail.com
Fri Apr 8 14:12:00 BST 2011
I've done some more debugging.
I opened the writable database from IRB and ran strace on both processes:
Parent process (IRB, running x.close)
close(5) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
kill(8678, SIGHUP) = 0
waitpid(8678, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 8678
Child process (/bin/cat)
Process 8712 attached - interrupt to quit
read(0, 0x98dc000, 32768) = ? ERESTARTSYS (To be restarted)
--- SIGHUP (Hangup) @ 0 (0) ---
Process 8712 detached
So, in this case SIGHUP worked fine.
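For reference, the close / kill / waitpid sequence in the trace can be reproduced outside Xapian. This is a minimal Python sketch, not Xapian's actual code: it assumes the helper is /bin/cat with stdin and stdout attached to one end of a socket pair (which is what the lsof output further down suggests), and the names spawn_holder and release_holder are made up for illustration.

```python
import os
import signal
import socket
import subprocess


def spawn_holder():
    """Start /bin/cat with stdin and stdout on one end of a socket pair.

    Hypothetical helper: it only imitates the way the lock-holding child
    appears to be wired up in the lsof output.
    """
    parent_end, child_end = socket.socketpair()
    child = subprocess.Popen(
        ["/bin/cat"],
        stdin=child_end.fileno(),
        stdout=child_end.fileno(),
    )
    child_end.close()  # the parent keeps only its own end
    return parent_end, child


def release_holder(parent_end, child):
    """Repeat the traced sequence: close(fd), kill(pid, SIGHUP), waitpid(pid)."""
    parent_end.close()                 # cat now sees end-of-file on stdin
    os.kill(child.pid, signal.SIGHUP)  # harmless if cat has already exited
    return child.wait(timeout=5)       # reaps the child, returns its status


if __name__ == "__main__":
    end, proc = spawn_holder()
    print("exit status:", release_holder(end, proc))
```

The return value is 0 if cat saw end-of-file and exited on its own before the signal arrived, or -signal.SIGHUP if the signal ended it. The trace above shows the first case: SIGCHLD arrives before the kill, and waitpid reports WEXITSTATUS(s) == 0.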
I wonder whether it has something to do with how /bin/cat is being run. For example, in this case stdin/stdout were attached to a terminal, but under Apache2/Passenger they are probably attached to a log file.
So, when running from IRB, we can use lsof to see the standard file descriptors:
cat 8748 samuel cwd DIR 8,2 4096 2 /
cat 8748 samuel rtd DIR 8,2 4096 2 /
cat 8748 samuel txt REG 8,2 42816 131093 /bin/cat
cat 8748 samuel mem REG 8,2 1323460 278817 /lib/i686/cmov/libc-2.11.2.so
cat 8748 samuel mem REG 8,2 113964 278748 /lib/ld-2.11.2.so
cat 8748 samuel 0u unix 0xdcc3f3c0 0t0 1739038 socket
cat 8748 samuel 1u unix 0xdcc3f3c0 0t0 1739038 socket
cat 8748 samuel 4ww REG 8,2 0 1901174 /tmp/bob2/flintlock
I'm guessing that 0u and 1u are standard input and output respectively?
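As far as I know that reading is right: in lsof's FD column the number is the descriptor and the trailing "u" means it is open for both reading and writing. The same check can be done without lsof. This is a rough, Linux-only Python sketch that assumes /proc is mounted; stdio_targets is a made-up name, and the child is started the way the lsof output suggests (both descriptors on one socket).

```python
import os
import socket
import subprocess


def stdio_targets():
    """Return what descriptors 0 and 1 of a /bin/cat child point at.

    Linux-only: relies on the /proc/<pid>/fd/<n> symlinks.
    """
    parent_end, child_end = socket.socketpair()
    child = subprocess.Popen(
        ["/bin/cat"],
        stdin=child_end.fileno(),
        stdout=child_end.fileno(),
    )
    child_end.close()
    try:
        # Each link reads like "socket:[<inode>]" for a socket descriptor.
        return [os.readlink(f"/proc/{child.pid}/fd/{fd}") for fd in (0, 1)]
    finally:
        parent_end.close()     # end-of-file lets cat exit by itself
        child.wait(timeout=5)


if __name__ == "__main__":
    print(stdio_targets())
```

Both entries should name the same socket inode, which matches lsof listing the same DEVICE and NODE values for 0u and 1u.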
But, when I check the apache process, it seems fairly similar:
cat 9101 www-data cwd DIR 8,2 4096 2 /
cat 9101 www-data rtd DIR 8,2 4096 2 /
cat 9101 www-data txt REG 8,2 42816 131093 /bin/cat
cat 9101 www-data mem REG 8,2 1323460 278817 /lib/i686/cmov/libc-2.11.2.so
cat 9101 www-data mem REG 8,2 113964 278748 /lib/ld-2.11.2.so
cat 9101 www-data 0u unix 0xe2fc0580 0t0 1741378 socket
cat 9101 www-data 1u unix 0xe2fc0580 0t0 1741378 socket
cat 9101 www-data 4ww REG 8,2 0 1234737 /srv/www/www.oriontransfer.co.nz/xapian.db/flintlock
In fact, no real difference at all. However, it might depend on the parent process and how those sockets are attached. SIGHUP means that the writing end of the input has been closed, right?
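On the last question: as far as I understand it, SIGHUP is a separate signal (traditionally sent when a controlling terminal goes away). Closing the writing end does not raise it; the reader just sees end-of-file, and only once every copy of that descriptor has been closed. Here is a small Python sketch of that second point. The function name and the keep_duplicate flag are invented for the example, and the duplicate stands in for a copy of the descriptor inherited by some other process, which is only a guess at what might differ under Passenger.

```python
import socket


def eof_after_close(keep_duplicate):
    """Close the writer and report whether the reader sees end-of-file.

    With keep_duplicate=True a second descriptor for the writing end stays
    open, imitating a copy held elsewhere (for example by a forked process).
    """
    reader, writer = socket.socketpair()
    duplicate = writer.dup() if keep_duplicate else None
    writer.close()
    reader.setblocking(False)
    try:
        # An empty result means end-of-file: no writer is left.
        saw_eof = reader.recv(16) == b""
    except BlockingIOError:
        # Nothing to read yet, but a writer still exists: no end-of-file.
        saw_eof = False
    finally:
        reader.close()
        if duplicate is not None:
            duplicate.close()
    return saw_eof


if __name__ == "__main__":
    print(eof_after_close(False), eof_after_close(True))
```

If the parent's end of the socket had been inherited by another process, closing it in one place would not be enough for /bin/cat to see end-of-file, so that might be worth checking with lsof on the Passenger side.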
I've also noticed one other peculiar fact. Occasionally, Passenger will kill the Ruby/Rack process and then /bin/cat becomes detached:
Firstly:
10045 10115 10043 10038 ? -1 Sl 33 0:05 | | \_ Passenger ApplicationSpawner: /srv/www/www.oriontransfer.co.nz
10115 10123 10043 10038 ? -1 S 33 0:00 | | | \_ /bin/cat
A while later:
1 10123 10043 10038 ? -1 S 33 0:00 /bin/cat
In this case, the parent process is 1 (init). Well, this behavior also seems a bit buggy. If the parent process dies, the child process needs to be cleaned up no matter what, right?
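The reparenting itself seems to be normal behaviour rather than a bug: a child is not killed just because its parent exits, it is adopted by init (or by another reaper process on newer systems). A minimal Python sketch to see this, using sleep as a stand-in for /bin/cat; orphan_survives is a made-up name, and it assumes sleep is on the PATH.

```python
import os
import signal
import subprocess
import sys

# Short-lived "parent": starts a long-running child, prints its PID, exits.
PARENT_SCRIPT = """
import subprocess
child = subprocess.Popen(
    ["sleep", "30"],
    stdin=subprocess.DEVNULL,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
print(child.pid)
"""


def orphan_survives():
    """Return True if the child is still running after its parent has exited."""
    result = subprocess.run(
        [sys.executable, "-c", PARENT_SCRIPT],
        capture_output=True,
        text=True,
        check=True,
    )
    pid = int(result.stdout)
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
        alive = True
    except ProcessLookupError:
        alive = False
    if alive:
        os.kill(pid, signal.SIGKILL)  # clean up the orphan ourselves
    return alive


if __name__ == "__main__":
    print("orphan still running:", orphan_survives())
```

So nothing removes the child automatically when the parent dies; it has to notice on its own, for example by seeing end-of-file on its input. On Linux a child can also ask to be signalled when its parent dies via prctl(PR_SET_PDEATHSIG), though I have not looked at whether that would suit Xapian.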
Well, that's my input for now. If you have any other ideas about what to look for, let me know.
Kind regards,
Samuel
On 8/04/2011, at 9:21 PM, Olly Betts wrote:
> On Fri, Apr 08, 2011 at 05:33:02PM +1200, Samuel Williams wrote:
>> I noticed that it was sent SIGHUP, but it didn't quit for some reason.
>> Maybe you need to change this to SIGKILL? I was wondering if you knew
>> what "= ? ERESTARTSYS (To be restarted)" meant?
>
> Currently drowning in GSoC applications (as the deadline looms), but
> using SIGKILL is probably a sane change and could help. Can you try
> that and see how you do?
>
> I'll try to make some sense of the logs later.
>
> Cheers,
> Olly