[Xapian-discuss] How to ignore many occurrences of the same term in one document for relevance computation?
James Aylett
james-xapian at tartarus.org
Thu Apr 12 21:15:37 BST 2007
On Thu, Apr 12, 2007 at 11:44:35AM -0700, Kevin Duraj wrote:
> I would like to ask how I can make Xapian ignore, in its relevance
> computation, how many times the same term occurs in a document. In other
> words, I would like relevance not to depend on how many times a term
> appears in a document.
You can do this by adjusting the weighting scheme. The key here is
to remove the influence of each term's wdf (within-document frequency).
I think you can just set k1_ to 0 in BM25Weight, but I've never tried it.
Use Xapian::Enquire::set_weighting_scheme() to replace the default
weighting scheme (you can construct a BM25Weight object with the
relevant parameters to pass in to this).
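
To see why setting k1_ to 0 should take wdf out of the picture, here is a
minimal sketch of the wdf factor in the textbook BM25 formula. It is not
Xapian's source and Xapian's implementation may differ in detail; the
function name bm25_wdf_factor and its parameter names are made up for
illustration.

```cpp
// Sketch only: the textbook BM25 within-document-frequency factor.
// bm25_wdf_factor is a hypothetical helper, not part of the Xapian API.
//
//   wdf     - how many times the term occurs in the document
//   k1      - controls how strongly wdf influences the weight
//   b       - controls how strongly document length is normalised
//   normlen - document length divided by the average document length
double bm25_wdf_factor(double wdf, double k1, double b, double normlen) {
    if (wdf <= 0.0) {
        return 0.0;  // the term does not occur, so it contributes nothing
    }
    // Length-normalised damping term; it vanishes when k1 is 0.
    const double K = k1 * ((1.0 - b) + b * normlen);
    // With k1 == 0 this reduces to wdf / wdf, i.e. the same value for
    // every document that contains the term at least once.
    return (k1 + 1.0) * wdf / (K + wdf);
}
```

With k1 = 0 the factor is the same whether a term occurs once or fifty
times, so only the presence of the term matters. In Xapian terms this
would presumably mean passing something like
Xapian::BM25Weight(0, 0, 1, 0.5, 0.5) to set_weighting_scheme(), where
every argument other than the first is what I believe to be the default;
check the BM25Weight constructor documentation for your version.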
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org