[Xapian-discuss] Filtering queries with many boolean terms

Olly Betts olly at survex.com
Mon Oct 5 21:50:44 BST 2009


On Mon, Oct 05, 2009 at 07:27:38PM +0100, James Aylett wrote:
> On Sat, Oct 03, 2009 at 09:15:48PM -0400, Jason Tackaberry wrote:
> 
> >         (foo) (tid:0 tid:1 tid:2 tid:3)
> 
> What are the brackets for? The '*' in the output is, I think,
> OP_SCALE_WEIGHT, which doesn't seem a good query structure for what
> you're trying to do.

Yes, "*" means OP_SCALE_WEIGHT.

> You actually want a FILTER query at top level; which you've figured
> out how to generate. I don't know why this isn't behaving as fast as
> you want (some tabulated figures on this, with information about the
> corpus size and so on might help others respond here).

As of 1.0.4, the query optimiser pushes any scale factor down to the
leaf level when it builds the postlist tree, and in 1.1.x the factor is
pushed into the weighting scheme.  So the different query
representations aren't *necessarily* executed differently.  In
particular, OP_FILTER should be executed the same as OP_AND with
OP_SCALE_WEIGHT 0 applied to one branch.
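
To illustrate the equivalence, here is a toy model in Python.  It is
only a sketch and not Xapian's actual code: the INDEX contents, the
class names and evaluate() are all made up for illustration, and
weights are simply summed.  It carries the scale factor down to the
leaves and shows that a filter gives the same matches and weights as
an AND with one branch scaled by 0.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# Hypothetical toy index: term -> {docid: weight contribution}.
INDEX: Dict[str, Dict[int, float]] = {
    "foo": {1: 2.0, 2: 1.5, 3: 0.5},
    "tid:0": {1: 1.0},
    "tid:1": {3: 1.0, 4: 1.0},
}


@dataclass(frozen=True)
class Term:
    name: str


@dataclass(frozen=True)
class Or:
    subqueries: Tuple["Query", ...]


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Filter:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Scale:
    subquery: "Query"
    factor: float


Query = Union[Term, Or, And, Filter, Scale]


def evaluate(query: Query, factor: float = 1.0) -> Dict[int, float]:
    """Return {docid: weight}, applying the scale factor only at the leaves."""
    if isinstance(query, Term):
        postings = INDEX.get(query.name, {})
        return {doc: w * factor for doc, w in postings.items()}
    if isinstance(query, Scale):
        # No work is done here: the factor is just passed further down.
        return evaluate(query.subquery, factor * query.factor)
    if isinstance(query, Or):
        merged: Dict[int, float] = {}
        for sub in query.subqueries:
            for doc, w in evaluate(sub, factor).items():
                merged[doc] = merged.get(doc, 0.0) + w
        return merged
    if isinstance(query, And):
        left = evaluate(query.left, factor)
        right = evaluate(query.right, factor)
        return {doc: w + right[doc] for doc, w in left.items() if doc in right}
    if isinstance(query, Filter):
        left = evaluate(query.left, factor)
        # The right branch only restricts the match set; it adds no weight.
        right = evaluate(query.right, 0.0)
        return {doc: w for doc, w in left.items() if doc in right}
    raise TypeError(f"unknown query node: {query!r}")


foo = Term("foo")
tids = Or((Term("tid:0"), Term("tid:1")))

filtered = evaluate(Filter(foo, tids))
scaled = evaluate(And(foo, Scale(tids, 0.0)))
```

In this model the Scale node never does any work at match time, which
is the point of pushing the factor down: both query shapes end up
doing the same thing at the leaves.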

But I've not had time to look at this in detail yet, I'm afraid.

Cheers,
    Olly


