[Xapian-discuss] Filtering queries with many boolean terms

Olly Betts olly at survex.com
Thu Oct 8 17:29:48 BST 2009


On Thu, Oct 08, 2009 at 10:30:01AM -0400, Jason Tackaberry wrote:
> On Thu, 2009-10-08 at 01:07 +0100, Olly Betts wrote:
> > Unnecessary extra work happens both when replace_document() is called,
> > and also when flush() is (or when it happens implicitly).
> 
> Excellent.  I noticed the target for issue #250 is 1.2.0.  Is that still
> the current target, and if so, do you have any sense at this point
> roughly when this would be released?

The "1.2.0" milestone is currently a dumping ground for "stuff to
probably be addressed in 1.2.x", so don't read too much into that.
I've not looked at the pile recently, but I'd guess this is probably one
of the ones likely to get done sooner.

The 1.2.0 release was meant to be out in early September, but the last
few issues dragged out and ran into my trip back to the UK to clear out
and sell my house there, which has stalled progress for a while.

> Also, do you yet have some intuition as to what kind of speed
> improvement should be expected?

I'd think pretty substantial.  With the fix, work should be broadly
proportional to the number of terms being modified, though there may
be some overhead in tracking what to change.  But if you have a few
hundred terms per document and only change one, you could easily see
a 100-fold speedup.
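
As a rough illustration of that arithmetic, here is a toy cost model in
Python.  It is only a sketch, not how Xapian's backends actually account
for work: the function names, the "XTAG" term prefix, and the assumption
that cost is one unit per term touched are all made up for the example.

```python
def naive_update_cost(old_terms, new_terms):
    """Toy cost when every term of the old and new document is touched."""
    return len(set(old_terms)) + len(set(new_terms))


def diff_update_cost(old_terms, new_terms):
    """Toy cost when only the added or removed terms are touched."""
    old, new = set(old_terms), set(new_terms)
    # Symmetric difference: terms present in exactly one of the two versions.
    return len(old ^ new)


# Hypothetical document with 300 boolean terms, one of which is swapped.
old = {f"XTAG{i}" for i in range(300)}
new = (old - {"XTAG0"}) | {"XTAGnew"}

naive = naive_update_cost(old, new)
diff = diff_update_cost(old, new)
print(naive, diff, naive / diff)
```

Under those assumptions the ratio comes out in the hundreds; the
tracking overhead mentioned above would eat into that in practice.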

Cheers,
    Olly


