[Xapian-devel] Composition of NEAR and OR

Olly Betts olly at survex.com
Wed Nov 15 18:28:15 GMT 2006


On Wed, Nov 15, 2006 at 11:50:13AM +0100, Jean-Francois Dockes wrote:
> The following piece of code triggers an 'unimplemented' exception with the
> message:
>  "Can't use NEAR/PHRASE with a subexpression containing NEAR or PHRASE"
> 
>       Xapian::Query or1(Xapian::Query::OP_OR, 
> 		    Xapian::Query("one"), 
> 		    Xapian::Query("two"));
>       Xapian::Query or2(Xapian::Query::OP_OR, 
> 		    Xapian::Query("three"), 
> 		    Xapian::Query("four"));
>       Xapian::Query near(Xapian::Query::OP_NEAR, or1, or2);
> 
> I can't decide by looking at the code in omqueryinternal.cc if this is
> intentional or not.

It looks like it will flatten "(one OR two) NEAR three", but not a NEAR
with an OR subquery on both sides.
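
To make the flattening concrete: it distributes NEAR over OR, so each
combination of terms gets its own NEAR pair.  The following is only a
sketch on plain strings, not the code in omqueryinternal.cc;
distribute_near is a hypothetical helper, and the window size that a
real NEAR carries is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xapian: distribute NEAR over two
// OR-groups of plain terms, producing one NEAR pair per combination.
std::string distribute_near(const std::vector<std::string>& left,
                            const std::vector<std::string>& right)
{
    // An empty OR-group would leave nothing to pair up.
    assert(!left.empty() && !right.empty());

    std::string out;
    for (const std::string& l : left) {
        for (const std::string& r : right) {
            if (!out.empty()) out += " OR ";
            out += "(" + l + " NEAR " + r + ")";
        }
    }
    return out;
}
```

With {"one", "two"} on the left and {"three", "four"} on the right this
yields four NEAR pairs joined by OR, so the rewritten query grows with
the product of the two group sizes.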

Looking at the history of this code, it's been essentially the same
since revision 3194 (over 5 years ago), when Richard created this file.
That change looks like mostly restructuring: there's similar code in
omquery.cc (though not in its own method) prior to it, and I think that
has the same behaviour as the current code.  It looks like I originally
wrote it (over 6 years ago!).

I'm not sure this flattening is really the best way to handle this.
I think fixing NearPostList to handle non-LeafPostLists would be more
efficient.  All that really needs is a PositionList subclass which can
return (in order) all the positions in any of a list of PositionLists,
which isn't too hard.
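
A minimal sketch of that merging idea, assuming plain sorted vectors
stand in for PositionLists.  Positions and merge_positions are
hypothetical names rather than Xapian API, and reporting a position
shared by several lists only once is an assumption about what the
caller would want.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a PositionList: a sorted vector of positions.
using Positions = std::vector<unsigned>;

// Return, in ascending order, every position that occurs in any input
// list.  A position present in several lists is reported once.
Positions merge_positions(const std::vector<Positions>& lists)
{
    // Min-heap of (position, index of the list it came from).
    using Entry = std::pair<unsigned, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    // next[i] is the index of the entry of lists[i] currently in the heap.
    std::vector<std::size_t> next(lists.size(), 0);

    for (std::size_t i = 0; i < lists.size(); ++i) {
        assert(std::is_sorted(lists[i].begin(), lists[i].end()));
        if (!lists[i].empty()) heap.push({lists[i][0], i});
    }

    Positions merged;
    while (!heap.empty()) {
        Entry e = heap.top();
        heap.pop();
        // Skip a position already emitted from another list.
        if (merged.empty() || merged.back() != e.first)
            merged.push_back(e.first);
        // Advance the list the smallest entry came from.
        std::size_t i = e.second;
        if (++next[i] < lists[i].size())
            heap.push({lists[i][next[i]], i});
    }
    return merged;
}
```

A real subclass would presumably advance lazily rather than build the
whole merged vector up front, but the heap-based ordering would be the
same.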

> In debug mode, it does trigger the NEAR or PHRASE
> assertion at the top of flatten_subqs(), which gets called at some point
> for the query: 
>     ((one NEAR 2 three) OR (one NEAR 2 four))
> 
> which does not seem right or needed, 

That must be from a recursive call, since the only non-recursive call
happens for NEAR or PHRASE.

> Is this "(x or y) near (z or t)" query supposed to work or not?  I'm
> willing to try and fix it if it should work, but this area of the xapian
> code certainly does not suffer from an excess of comments ...

I think it should be supported, but whether the current code is meant to
support it I'm less clear about!  I suspect we might have decided to
handle the easier case with a leaf query on one side of the NEAR/PHRASE
"for now".

However, you could try just returning from flatten_subqs if op isn't
OP_NEAR or OP_PHRASE and see if that does the job.  Calling
"get_description" on the restructured query should show if it worked
or not.

Cheers,
    Olly


