NEAR non-leaf subqueries
Olly Betts
olly at survex.com
Fri Jan 20 23:23:32 GMT 2017
On Fri, Jan 20, 2017 at 03:35:13PM +0100, Jean-Francois Dockes wrote:
> Olly Betts writes:
> > On Thu, Jan 12, 2017 at 07:53:21PM +0100, Jean-Francois Dockes wrote:
> >
> > > Recoll also supports multi-word synonyms which could potentially
> > > generate PHRASE subqueries inside NEAR queries, but this
> > > understandably already did not work with 1.2, so the multi-word
> > > expansions are only used when proximity is not involved (by the way,
> > > proximity of phrases does make sense in this case, if there is a
> > > wishlist somewhere, but it's admittedly not an issue that most users
> > > will be concerned with...).
> >
> > Another case for https://trac.xapian.org/ticket/508 I think.
>
> The ticket only lists OP_OR as subqueries
OP_OR is the example used in the description, but the ticket isn't only
about OP_OR - note "OP_OR, *etc*" in the description, the title says
"non-leaf subqueries", and other operators are explicitly discussed:
* OP_AND: https://trac.xapian.org/ticket/508#comment:8
* OP_AND_NOT: https://trac.xapian.org/ticket/508#comment:11
I've added a note about OP_NEAR/OP_PHRASE.
> > The code I pushed before wouldn't handle an OR of more than two things,
> > so you couldn't do a 3+-way stem expansion:
> >
> > (text OR texts) NEAR (search OR searches OR searched OR searching)
> >
> > But I've just pushed an update which will handle this.
>
> Ok, I hadn't even noticed the limitation. Did it silently truncate the
> OR list?
It would throw Xapian::UnimplementedError.
> But, actually, so does the previous version (commit 389dfb319a66), which
> explains why I had not understood what the limitation was.
>
> Both versions also work fine with "floor floor floor"p:
>
> (floors OR flooring OR floored OR floor) NEAR 13
> (floors OR flooring OR floored OR floor) NEAR 13
> (floors OR flooring OR floored OR floor)
>
> So: me happy but confused...
I suspect that at most two of those terms are present in any given
document in your database - the limitation was actually on the number of
terms returning positions together for the OR, not the number in the
query.
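
To illustrate the distinction, here is a toy model in Python. It is not
Xapian's implementation: the dict-based document, the helper names
(present_terms, or_positions, near) and the simple "positions closer than
the window" test are all assumptions made for this sketch. It only shows
that what matters for the OR is how many of its terms actually supply
positions in a given document, not how many terms the query lists.

```python
from heapq import merge

# Hypothetical toy document: term -> sorted list of word positions.
doc = {"text": [1, 7], "search": [3], "searching": [9]}

# A 4-way stem expansion, as in the example query quoted earlier.
search_terms = ["search", "searches", "searched", "searching"]


def present_terms(doc_positions, or_terms):
    """Return the OR terms that actually occur in this document."""
    return [t for t in or_terms if t in doc_positions]


def or_positions(doc_positions, or_terms):
    """Merge the position lists of every OR term present in the document."""
    lists = [doc_positions[t] for t in present_terms(doc_positions, or_terms)]
    merged = []
    for pos in merge(*lists):
        # Skip duplicates in case two terms share a position.
        if not merged or merged[-1] != pos:
            merged.append(pos)
    return merged


def near(doc_positions, group_a, group_b, window):
    """Simplified NEAR: some position from each OR group lies within window."""
    a = or_positions(doc_positions, group_a)
    b = or_positions(doc_positions, group_b)
    return any(abs(x - y) < window for x in a for y in b)


# Four terms in the query, but only two contribute positions here.
contributing = present_terms(doc, search_terms)
matched = near(doc, ["text", "texts"], search_terms, 3)
```

In this toy document only "search" and "searching" contribute positions,
so a limit of two terms returning positions together would never be
reached, even though the OR in the query has four terms.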
Cheers,
Olly