[Xapian-devel] Re: [Xapian-commits] 7990: trunk/xapian-core/ trunk/xapian-core/bin/ trunk/xapian-core/tests/harness/

Mon Apr 2 05:54:19 BST 2007

On Mon, Apr 02, 2007 at 12:53:23PM +1000, Mark Hammond wrote:
> Olly writes:
> > I found that the WinsockInitializer mechanism just didn't work as
> > it currently is in SVN for TcpClient because the TcpClient constructor
> > creates the socket and passes it to initialise the parent class which
> > happens before the winsock_initializer member is initialised.
> 
> It did work for me.  I'm a little confused by the above though - my patch
> has the winsock initialization done via the construction of a static
> module-level variable - the constructor initialized Winsock.

Then Richard must have reworked this as the code checked in had
WinsockInitializer members of TcpServer and TcpClient.
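
To illustrate the ordering problem, here is a minimal sketch.  The class
names are hypothetical stand-ins, not the code in SVN.  C++ constructs base
classes before non-static members, so a member initialiser runs too late
to help the expression which creates the socket for the parent class.
Making the initialiser an earlier base class is one possible way around
that:

```cpp
#include <string>
#include <vector>

// Records the order in which things happen (illustration only).
static std::vector<std::string> events;

// Hypothetical stand-in for WinsockInitializer.
struct FakeInit {
    FakeInit() { events.push_back("init"); }
};

// Hypothetical stand-in for the code which creates the socket.
static int open_socket() {
    events.push_back("socket");
    return 42;
}

// Hypothetical stand-in for the parent class which is handed the socket.
struct Connection {
    explicit Connection(int fd_) : fd(fd_) {}
    int fd;
};

// Initialiser as a member: the base class, and so open_socket(), comes first.
struct MemberInitClient : Connection {
    FakeInit initializer;
    MemberInitClient() : Connection(open_socket()) {}
};

// Initialiser as an earlier base class: it is constructed before Connection.
struct BaseInitClient : private FakeInit, Connection {
    BaseInitClient() : FakeInit(), Connection(open_socket()) {}
};
```

Constructing a MemberInitClient records "socket" before "init", while
constructing a BaseInitClient records them the other way round.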

Incidentally, we've generally tried to avoid relying on global static
object construction (because it simply doesn't happen with some
toolchains).  But it's fine if the code is inherently limited to
platforms where it's known to work.  And it's possible that the problem
no longer manifests in any C++ toolchain in active use today.

> However, I really don't care enough to track this down :)  It *still* seems
> to work OK for me, so that is fine.  The test results in:
> 
> apitest total: 448 tests passed, 126 failed, 11 skipped.

Oddly, I'm doing better with mingw:

          total: 452 tests passed, 122 failed, 11 skipped.

Perhaps that's down to changes I've made as I've tried to work out
what's going wrong.

The "remote" backend tests all fail (that's the "prog" variant of the
remote backend, which isn't currently implemented for Windows), and that
accounts for 92 tests.

> > The problem is that tests seem to randomly fail to connect to the
> > remote backend, which I think must be down to timing issues with
> > starting xapian-tcpsrv versus trying to connect to it.  Under Unix we
> > carefully wait until xapian-tcpsrv outputs "Listening" to avoid this
> > problem.  Perhaps adding a short "sleep" after starting xapian-tcpsrv
> > will provide an acceptable workaround.
> 
> I too see failures that appear related to timing issues with the stooopid
> process launcher code I added, but haven't found the time to address that
> yet.
>
> A sleep() would probably work, as would redirecting the output of the
> program to a unique temp filename.

I've looked at this more now.  There are certain tests I can run just by
themselves and get this failure mode.

I didn't try a unique temp filename, but removing the output redirection
completely and running:

apitest -v -b remotetcp emptyquery1

results in output suggesting that xapian-tcpsrv was started twice, after
which apitest just sits there.  The task manager shows two xapian-tcpsrv
processes (I killed both and retried, and both are created by apitest
rather than one hanging around from an earlier run).

If I disable SO_REUSEADDR then I get output from the above command
suggesting that one xapian-tcpsrv starts and another fails.
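
For what it's worth, Windows documents SO_REUSEADDR as allowing a second
socket to bind to a port which is already in use, which would fit with two
xapian-tcpsrv processes both appearing to start.  Windows also provides
SO_EXCLUSIVEADDRUSE to request the opposite.  A minimal sketch of choosing
the option per platform follows.  The helper name set_bind_option is
hypothetical and this is not the code in xapian-core:

```cpp
#ifdef _WIN32
# include <winsock2.h>
typedef SOCKET socket_t;
#else
# include <sys/socket.h>
typedef int socket_t;
#endif

// Hypothetical helper (not from xapian-core): set the bind-related option
// on a socket before bind() is called.  Returns true on success.
static bool set_bind_option(socket_t fd) {
    int on = 1;
#ifdef _WIN32
    // Ask for exclusive use of the port, so a second bind() should fail.
    const int opt = SO_EXCLUSIVEADDRUSE;
#else
    // On Unix this mainly allows rebinding while old connections linger.
    const int opt = SO_REUSEADDR;
#endif
    return setsockopt(fd, SOL_SOCKET, opt,
                      reinterpret_cast<const char*>(&on), sizeof on) == 0;
}
```

I haven't tested whether this changes the behaviour described above.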

Looking at emptyquery1, it actually does open the same database twice,
but not with overlapping lifetimes (and not intentionally).  So I guess
a sleep in just the right spot should help, though my attempts to find
that spot have so far failed!

> The *best* solution is obviously code
> that directly invokes the child process with redirected std handles, so we
> can check for the "listening" string, the special 69 return value, and
> otherwise work like the unix version.

The "69" (EX_UNAVAILABLE) handling is just a nice bit of polish.  It's
mostly there because it means a tinderbox machine doesn't fail tests
just because someone happened to run the testsuite by hand on the same
machine at the same time.
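
As a rough sketch of that polish: this assumes 69 means the requested port
was unavailable and that the harness responds by trying another port.  The
helper name and the "launch" callback are hypothetical, not the harness
code itself:

```cpp
#include <functional>

// 69 is EX_UNAVAILABLE in <sysexits.h>; defined here because that header
// isn't available on every platform.
const int EXIT_UNAVAILABLE = 69;

// Hypothetical helper: `launch` starts a server on the given port and
// returns 0 once it is listening, or the server's exit status if it exited
// instead.  Returns the port which worked, or -1 on failure.
static int start_on_free_port(int first_port, int max_tries,
                              const std::function<int(int)>& launch) {
    for (int port = first_port; port < first_port + max_tries; ++port) {
        int status = launch(port);
        if (status == 0) return port;
        if (status != EXIT_UNAVAILABLE) return -1;  // A genuine failure.
        // Port busy (perhaps another test run is using it): try the next.
    }
    return -1;
}
```

Passing the launcher in as a callback keeps the retry logic separate from
the platform-specific business of starting the process.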

I wondered if it would work to start a new thread and run the "start
..." command from that using "popen" so we could wait for "Listening"
and then signal the parent thread to return.  Conceptually much like
the UNIX code, but a rather different implementation.  Perhaps that's
what you have in mind already.
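
Something along these lines is what I mean, as an untested sketch rather
than a patch.  The function names are hypothetical, and std::thread and
std::promise stand in for whatever threading primitives we would actually
use:

```cpp
#include <stdio.h>
#include <cstring>
#include <future>
#include <string>
#include <thread>
#include <utility>

#ifdef _WIN32
# define popen _popen
# define pclose _pclose
#endif

// Runs in the helper thread: start the command, signal once a line
// containing "Listening" is seen, then keep draining the output so the
// child never blocks on a full pipe.
static void read_until_listening(std::string command,
                                 std::promise<bool> ready) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) {
        ready.set_value(false);
        return;
    }
    bool signalled = false;
    char line[256];
    while (fgets(line, sizeof line, pipe)) {
        if (!signalled && std::strstr(line, "Listening")) {
            ready.set_value(true);
            signalled = true;
        }
    }
    // The child exited without ever reporting that it was listening.
    if (!signalled) ready.set_value(false);
    pclose(pipe);
}

// Called from the parent thread: returns true once the child reports
// "Listening", or false if it exits (or fails to start) first.
static bool start_and_wait(const std::string& command) {
    std::promise<bool> ready;
    std::future<bool> result = ready.get_future();
    std::thread(read_until_listening, command, std::move(ready)).detach();
    return result.get();
}
```

The helper thread is detached so it can carry on reading after the parent
has been signalled.  A real version would also want a timeout and a way to
report the exit status.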

> there appears to be no target named 'check' in the top-level MSVC makefile.
> However, that makefile does appear to automatically run all tests.  So after
> a top-level build, I always have apitest.exe executed, generating the number
> of failures I summarised before.  The make process then terminates due to
> apitest.exe failing.  I've not checked if other tests are expected to be run
> after apitest.

There are other test programs, but none of the others exercise the
remote backend so apitest is the only interesting one here.

> Yeah, I assumed it was due to 2 processes attempting to redirect to the same
> file.  For example, if I execute:
> 
> % python > \temp\delme.out
> 
> And then attempt to do the same thing again while the first is running, I
> get:
> 
> % python > \temp\delme.out
> The process cannot access the file because it is being used by another
> process.

I hadn't considered this issue.  It can't be helping, but it's not the
only cause.

Cheers,
    Olly