[Xapian-discuss] Omega datevalue search fails during scriptindex flush

James Aylett james-xapian at tartarus.org
Thu Dec 13 15:11:35 GMT 2007


On Thu, Dec 13, 2007 at 03:54:52PM +0100, Steije van Schelt wrote:

> Since there is no output after pressing Enter several times, I assume
> scriptindex is indexing data.

Try attaching strace (or your system's equivalent) to it and see whether
it's making any system calls. Also, what's the process state (via top),
and is it using any CPU time? It might be blocked on something.
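
If top isn't convenient, the same two facts can be read directly. This is
only a minimal sketch, assuming a Linux system with /proc mounted; the
helper names (parse_stat, sample) are made up for illustration and are
not part of Xapian or Omega.

```python
import os

def parse_stat(text):
    """Return (state, utime_ticks, stime_ticks) from /proc/<pid>/stat text."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the last ')' rather than on whitespace.
    rest = text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return rest[0], int(rest[11]), int(rest[12])

def sample(pid):
    """Read the state and CPU seconds used so far by process `pid`."""
    with open(f"/proc/{pid}/stat") as f:
        state, utime, stime = parse_stat(f.read())
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    return state, (utime + stime) / ticks_per_second
```

Call sample() on the scriptindex pid twice, a few seconds apart. State
"D" with no growth in CPU seconds suggests it is blocked on I/O; state
"R" with CPU seconds climbing suggests it is genuinely busy.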

> * The search I perform on omega is as follows:
> 
> omega?P=harry&B=XINja&B=XAR0&DEFAULTOP=and&DB=...&FMT=customxml&xDB=...&xFILTERS=--O&TOPDOC=0&HITSPERPAGE=20&MINHITS=50&DATEVALUE=4&START=20071206
> 
> * After a while (20-30 seconds), omega just returns a blank page. No
> errors, nothing (not even in the Apache error logs).

Sounds like you're hitting Apache's internal timeout. You can
reconfigure this, but it would be better to find out what's going
on. If you run omega on the command line, you get a testing interface
- does the same kind of query take as long then? (I expect it will.)

[Diversion sidebar

Apache's timeout makes it an awkward web server to use on its own in
this kind of situation. You can jack up the timeout, but that leaves
you vulnerable to DoS attacks (deliberate or not) if you have URIs
that will take a long time to return. It's not easy to come up with a
really nice alternative, because you'd need to get omega running under
FastCGI, or something like that. (Then you could use something that
doesn't block its process/thread on content generation in order to
front end everything, which should scale better.)

]

> If I remove the DATEVALUE=4&START=20071206 everything just works
> fine. If scriptindex is replacing or adding records the search works
> fine as well. Only during flush it seems Xapian won't give any
> results.

We'd need to identify that scriptindex is actually flushing before
coming to that conclusion I think. Note that when scriptindex shows
that it's replacing/adding records, it isn't going to be hitting the
database on disk, so it won't affect the search process (which does
lend weight to the idea that it's in flush). However, the way Xapian
flushes is designed to let a reader run concurrently with the writer,
so something isn't quite right here.

If you can find out what scriptindex is actually doing while it's
sitting there, that should help a little. There's something in the
date stuff that clearly isn't helping, but I don't know why not. (The
reader shouldn't have to block on the value table, should it?)
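
To pin down how much the date filter costs, it may help to time the same
query with and without it, using a client-side timeout longer than
Apache's. The following is a rough sketch only: the function names are
hypothetical, the base URL is whatever your omega CGI lives at, and
DATE_PARAMS simply lists omega's date-related CGI parameters.

```python
import time
import urllib.request
from urllib.parse import parse_qsl, urlencode

# Omega CGI parameters that control date filtering.
DATE_PARAMS = {"DATEVALUE", "START", "END", "SPAN"}

def without_date_filter(query):
    """Return the query string with the date parameters removed."""
    pairs = parse_qsl(query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if k not in DATE_PARAMS]
    return urlencode(kept)

def time_query(base_url, query, timeout=300):
    """Fetch base_url?query and return (seconds taken, bytes received)."""
    started = time.monotonic()
    with urllib.request.urlopen(base_url + "?" + query, timeout=timeout) as resp:
        body = resp.read()
    return time.monotonic() - started, len(body)
```

Running time_query() on the original query and on
without_date_filter(query), both while scriptindex is idle and while it
is sitting there silently, gives four numbers to compare.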

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


