NEAR non-leaf subqueries
Jean-Francois Dockes
jf at dockes.org
Thu Jan 12 18:53:21 GMT 2017
Olly Betts writes:
> On Wed, Jan 04, 2017 at 07:29:58AM +0100, Jean-Francois Dockes wrote:
> > Olly Betts writes:
> > > The ticket has a patch which attempts to handle the OR case (which seems
> > > to be the part you actually care about) but this suffers from issues with
> > > object lifetimes which get a bit involved in the details. Since there
> > > wasn't a working patch when we got to making the hard decisions about
> > > which tickets to bump to get 1.4.0 out, and since addressing this
> > > shouldn't require ABI changes, it got bumped.
> >
> > Thank you for this answer.
> >
> > I need to choose between three approaches:
> >
> > - Implement support at the application level.
> > - Shift back to 1.2
> > - Just wait for 1.4.x
>
> Or help fix up the patch in the ticket?
Yep. But I earnestly believe that I'm not up to the task of fiddling with
Xapian internals. You may remember that I gave it a try quite a long time
ago, (it was the very same issue actually), and that, if I remember well,
my change did not quite do what it was supposed to do...
> > I'd rather go back to 1.2 than used a patched 1.4 by the way.
>
> Once we have a working patch, it should be mergable into 1.4.x (I can't
> see why any ABI changes would be needed) so using a patched 1.4
> shouldn't be an issue.
My phrase was unclear: explanation: I could use a patched 1.4 on Windows
where libxapian is bundled with recoll, but I was thinking ahead to a
situation where I'd have a 1.2/1.4 choice on Linux, where bundling a
patched 1.4 would not be acceptable. In the latter case, I'd rather use 1.2
because of the NEAR issue.
> > This all depends on your expected schedule (I guess that this would have
> > been a better term than 'plan', which is indeed described in the ticket). I
> > am not asking for anything beyond information here. Do you have any idea of
> > the very approximate time when the change might be implemented ?
>
> I had another poke at the patch and have a reworked version which solves the
> object lifetime issue and works for some simple tests. Can you try it out
> and see if it works for you?
>
> https://trac.xapian.org/ticket/508#comment:13
>
> There are two limitations:
>
> * Only OP_OR subqueries are handled. I think supporting these would be a
> useful step forward by itself, and AIUI it's all you actually need.
Yes, my need arises from stem or synonym expansions occurring inside a NEAR
query.
This happens without the user doing anything special, so it's a problem
when it causes an error.
Recoll also supports multi-word synonyms which could potentially generate
PHRASE subqueries inside NEAR queries, but this understandably already did
not work with 1.2, so the multi-word expansions are only used when proximity
is not involved (by the way, proximity of phrases does make sense in this
case, if there is a wishlist somewhere, but it's admittedly not an issue
that most users will be concerned with...).
> * Currently the OP_OR subqueries can only have two subqueries of their own.
> Lifting this restriction needs a bit of work on the new
> OrPositionList class
> - the old patch used a series of pairwise OrPositionList objects, but the
> new patch needs a single one instead - the class needs reworking to handle
> that.
>
> So I think the second limitation needs addressing, and of course any bugs
> resolving.
I am not sure that I completely understand this paragraph, but, anyway,
although I have a bit of trouble reading my own code, I think that recoll
will only add flat OP_OR queries as subqueries of the NEAR one. I tested
the patch and it does seem to answer my selfish needs...
> I can't promise anything re schedule, but hopefully we can sort this out
> fairly soon. At least the solution for what's missing now is fairly clear -
> we probably want to put the sub-positionlists into a min heap.
See, you lost me with the last phrase, and that's why it's better that I
don't get into Xapian-core internals :)
Anyway it's good enough to know that a patch exists which will hopefully
make its way into 1.4.x, meaning that I have no need to work on a bad
application-level solution. Thanks !
Cheers,
jf
More information about the Xapian-discuss
mailing list