[Xapian-devel] NearPostList and get_wdf

Yann ROBIN me.show at gmail.com
Mon Dec 29 13:09:14 GMT 2008


On Mon, Dec 29, 2008 at 1:50 PM, Richard Boulton
<richard at lemurconsulting.com> wrote:
>> So I thought that I could change the wdf in the NearPostList according
>> to the distance between words. But it seems that the get_wdf of the
>> NearPostList is never called ... Instead it's the get_wdf of the
>> ChertPostList that is called.
>
> Indeed; the wdf is used in the weight calculation, and the weight
> calculation is performed on each "leaf" postlist.
>
> I'm not sure that modifying the wdf is really the way to go about this - it
> seems to me that you might do better to use a custom weight class, which
> factored in the frequencies of the individual terms, as well as their
> proximity.
>
> For an example of a postlist which combines several terms together and
> calculates a weight on them, take a look at the SynonymPostList (and
> corresponding OP_SYNONYM operator) on the "opsynonym" branch in SVN.  This
> combines the wdfs of the terms being "synonymed" together, and passes that
> into the standard weighting algorithm.  It has a few issues, though (which
> is why it's not on trunk, yet).  See http://trac.xapian.org/ticket/50
>

OK thanks, I'll take a look. But I just want to point out that the get_wdf
method in the NearPostList is never called; it's as if you implemented
it for nothing?
And making a new weight class would certainly be the best way,
but I would need access to the weight object, which is only
available as a protected member of LeafPostList ... Maybe you address
this issue in the SynonymPostList?
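
To make the "term frequencies plus proximity" idea concrete, here is a rough sketch in plain Python. It is not Xapian code: term_weight, min_span and near_weight are hypothetical names, the saturating frequency formula is a simplification rather than Xapian's actual weighting, and the linear proximity boost is just one arbitrary choice.

```python
def term_weight(wdf, doc_len, avg_len, k1=1.0):
    """Simplified saturating term-frequency weight (an assumption,
    not the formula Xapian uses)."""
    return wdf / (wdf + k1 * doc_len / avg_len)


def min_span(pos_a, pos_b):
    """Smallest distance between any position of term A and any of term B."""
    return min(abs(a - b) for a in pos_a for b in pos_b)


def near_weight(pos_a, pos_b, doc_len, avg_len, window=10):
    """Combine per-term weights with a hypothetical proximity boost."""
    span = min_span(pos_a, pos_b)
    if span > window:
        return 0.0  # the document would not match NEAR at all
    # Here the wdf of each term is taken to be its number of positions.
    base = (term_weight(len(pos_a), doc_len, avg_len)
            + term_weight(len(pos_b), doc_len, avg_len))
    # Boost shrinks linearly to 1.0 as the span reaches the window edge.
    boost = 1.0 + (window - span) / window
    return base * boost
```

With this shape, two documents with identical term frequencies rank differently when the terms sit closer together, which is the effect I was trying to get by changing the wdf.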

>
> Feel free to open a feature request ticket, describing the feature that you
> would like to exist.  OP_NEAR as it is currently implemented is behaving as
> intended, though.
>

The ticket was more about get_wdf not being called; I don't think that was
intended.
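
For reference, a toy model of the behaviour discussed in this thread, in plain Python. It is not Xapian code: the class names only mirror the ones mentioned here, and the methods, the weight formula and the run_match helper are simplifications made up for illustration. The NEAR node only filters documents by position, and the weight is summed from the leaf postlists, so the NEAR node's own get_wdf is never reached.

```python
class LeafPostList:
    """Toy leaf postlist for one term: docid -> (wdf, positions)."""

    def __init__(self, postings):
        self.postings = postings
        self.wdf_calls = 0  # counts how often get_wdf is invoked

    def get_wdf(self, docid):
        self.wdf_calls += 1
        return self.postings[docid][0]

    def get_positions(self, docid):
        return self.postings[docid][1]

    def get_weight(self, docid):
        # The weight is computed here, at the leaf, from the leaf's own wdf.
        wdf = self.get_wdf(docid)
        return wdf / (wdf + 1.0)


class NearPostList:
    """Toy NEAR node: filters by position, delegates weighting to leaves."""

    def __init__(self, left, right, window):
        self.left, self.right, self.window = left, right, window
        self.wdf_calls = 0

    def get_wdf(self, docid):
        # Defined, but nothing in the weighting path below calls it.
        self.wdf_calls += 1
        return min(self.left.get_wdf(docid), self.right.get_wdf(docid))

    def matches(self, docid):
        if docid not in self.left.postings or docid not in self.right.postings:
            return False
        return any(abs(a - b) <= self.window
                   for a in self.left.get_positions(docid)
                   for b in self.right.get_positions(docid))

    def get_weight(self, docid):
        return self.left.get_weight(docid) + self.right.get_weight(docid)


def run_match(near, docids):
    """Return {docid: weight} for the documents that satisfy NEAR."""
    return {d: near.get_weight(d) for d in docids if near.matches(d)}


left = LeafPostList({1: (2, [3, 40]), 2: (1, [5])})
right = LeafPostList({1: (1, [4]), 2: (1, [90])})
near = NearPostList(left, right, window=10)
results = run_match(near, [1, 2])
```

After the run, near.wdf_calls is still 0 while each leaf has been asked for its wdf once, which matches what I see with the real NearPostList and ChertPostList.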


Thanks for your response!

-- 
Yann


