[Xapian-devel] Composition of NEAR and OR

Olly Betts olly at survex.com
Thu Nov 16 16:47:51 GMT 2006


On Thu, Nov 16, 2006 at 01:38:22PM +0100, Jean-Francois Dockes wrote:
> Olly Betts writes:
>  > What happens is that the PostList is positioned on each document which
>  > matches an AND query, and then test_doc() is called.  See SelectPostList
>  > (parent class of NearPostList and PhrasePostList) for where this
>  > happens.
> 
> Because the "source" and "terms" postlists are references to an AndPostlist
> and its components, the "terms" lists get positioned automagically when
> next() is called on source? Or what? :)

Oh, I see why you are confused!

The terms vector is set up when the NearPostList/PhrasePostList is
constructed - it simply contains the pointers to *the same* PostLists
which the AndPostList tree uses, but in the original query order
(AndPostList is reordered so that the least frequent terms are checked
first, as that will generally minimise the work done).  So when the
AndPostList is advanced, all the PostLists in terms are advanced too,
because they're the very same PostLists!
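
To make the sharing concrete, here is a minimal sketch of the idea.  It
is not the actual Xapian code: ToyPostList, ToyAnd and
first_match_seen_by_terms() are hypothetical names invented for
illustration, and the real classes do a good deal more.  The only point
is that two containers hold pointers to the same objects in different
orders, so advancing through one moves what the other sees.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a PostList: sorted docids plus a cursor.
struct ToyPostList {
    std::vector<unsigned> docids;   // sorted ascending
    std::size_t pos = 0;

    explicit ToyPostList(std::vector<unsigned> ids) : docids(std::move(ids)) {}

    bool at_end() const { return pos >= docids.size(); }
    unsigned docid() const { return docids[pos]; }
    // Move the cursor to the first entry >= target.
    void skip_to(unsigned target) {
        while (!at_end() && docid() < target) ++pos;
    }
};

// Hypothetical AND matcher.  It holds non-owning pointers and reorders
// them so the shortest (least frequent) list is checked first.
struct ToyAnd {
    std::vector<ToyPostList*> lists;

    explicit ToyAnd(std::vector<ToyPostList*> in) : lists(std::move(in)) {
        std::sort(lists.begin(), lists.end(),
                  [](const ToyPostList* a, const ToyPostList* b) {
                      return a->docids.size() < b->docids.size();
                  });
    }

    // Position every list on the first common docid >= min_docid.
    // Returns false if no such document exists.
    bool advance_to(unsigned min_docid) {
        unsigned candidate = min_docid;
        for (;;) {
            bool all_match = true;
            for (ToyPostList* pl : lists) {
                pl->skip_to(candidate);
                if (pl->at_end()) return false;
                if (pl->docid() != candidate) {
                    candidate = pl->docid();   // retry with a later docid
                    all_match = false;
                    break;
                }
            }
            if (all_match) return true;
        }
    }
};

// Advance the AND matcher only, then report the docid that the "terms"
// vector (original query order) is positioned on; 0 means no match.
unsigned first_match_seen_by_terms() {
    ToyPostList a({1, 2, 3, 5, 8, 9});
    ToyPostList b({5, 9});
    ToyPostList c({2, 5, 7, 9});
    std::vector<ToyPostList*> terms{&a, &b, &c};   // query order
    ToyAnd and_tree(terms);                        // same pointers, reordered
    if (!and_tree.advance_to(1)) return 0;
    for (ToyPostList* t : terms)
        if (t->docid() != terms[0]->docid()) return 0;
    return terms[0]->docid();
}
```

Nothing ever calls skip_to() through terms, yet every entry in terms
ends up on the matching document, which is the state a position check
like test_doc() needs.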

> The trick as I see it was that flatten_subqs() must not be called
> recursively on the object itself *which is not a NEAR query* anymore after
> the first transformation.
> 
> flatten_subqs() is called on each of the subqueries instead, after the
> transformation. 

That sounds about right.
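
For anyone following along, a rough sketch of that recursion pattern.
This is not the patch and not Xapian's flatten_subqs(): I'm assuming
the transformation distributes NEAR over OR, so that NEAR(a, OR(b, c))
becomes OR(NEAR(a, b), NEAR(a, c)), and ToyQuery, leaf(), node(),
flatten_near() and show() are all hypothetical names.  It needs C++17
for the vector of an incomplete type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Op { LEAF, OR, NEAR };

// Hypothetical toy query tree, not Xapian::Query.
struct ToyQuery {
    Op op;
    std::string term;              // used when op == LEAF
    std::vector<ToyQuery> subqs;   // used otherwise
};

ToyQuery leaf(std::string t) {
    return ToyQuery{Op::LEAF, std::move(t), {}};
}

ToyQuery node(Op op, std::vector<ToyQuery> subqs) {
    return ToyQuery{op, "", std::move(subqs)};
}

// Distribute NEAR over the first OR subquery found, then recurse into
// the subqueries.  After the rewrite q is an OR, not a NEAR, so the
// function is deliberately NOT called again on q itself.
void flatten_near(ToyQuery& q) {
    if (q.op == Op::LEAF) return;
    if (q.op == Op::NEAR) {
        for (std::size_t i = 0; i < q.subqs.size(); ++i) {
            if (q.subqs[i].op != Op::OR) continue;
            // One copy of the NEAR per alternative of the OR.
            std::vector<ToyQuery> alternatives;
            for (const ToyQuery& alt : q.subqs[i].subqs) {
                ToyQuery near_copy = q;
                near_copy.subqs[i] = alt;
                alternatives.push_back(std::move(near_copy));
            }
            q = node(Op::OR, std::move(alternatives));
            break;
        }
    }
    // Each new NEAR may still contain an OR; handle it one level down.
    for (ToyQuery& sub : q.subqs) flatten_near(sub);
}

// Render a tree as text so the result can be inspected.
std::string show(const ToyQuery& q) {
    if (q.op == Op::LEAF) return q.term;
    std::string out = (q.op == Op::OR) ? "OR(" : "NEAR(";
    for (std::size_t i = 0; i < q.subqs.size(); ++i) {
        if (i) out += ",";
        out += show(q.subqs[i]);
    }
    return out + ")";
}
```

With this sketch, NEAR(a, OR(b, c)) comes out as
OR(NEAR(a,b),NEAR(a,c)), and a NEAR with two OR subqueries is expanded
in two passes, one at each level of the recursion.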

> Here are both links (can't remember if this list accepts attachments...):
>  http://www.recoll.org/xapian/xapNearDistrib.patch
>  http://www.recoll.org/xapian/xapNearDistrib.cpp

Thanks, I'll take a look.

I think text attachments are currently accepted, though for non-trivial
patches I've started to put them on the web and post a link instead.
That's partly to avoid filling up subscribers' mailboxes, and also
because it can be tricky to get a true copy of a patch from web-based
list archives (e.g. spam protection of email addresses can modify a
patch).

Cheers,
    Olly


