[Xapian-discuss] Question about synonyms and relevancy results.

James Aylett james-xapian at tartarus.org
Thu Jan 3 16:18:15 GMT 2008


On Wed, Jan 02, 2008 at 11:51:47PM -0700, Rusty Conover wrote:

> Why does the use of synonyms decrease relevancy of the returned results?

Because the synonyms probably won't match documents that have the
original terms (in the general case), so there's a lower proportion of
terms in the query matching those documents.
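
A rough illustration of that effect, as a toy model only: it assumes the
percentage is simply the matched term weight divided by the total weight
of all query terms, which is a simplification and not Xapian's actual
calculation. The terms and weights are made up.

```python
def percent(doc_terms, query_weights):
    """Toy relevancy percentage: matched weight / total query weight.

    Simplified model for illustration, not Xapian's real formula.
    """
    matched = sum(w for term, w in query_weights.items() if term in doc_terms)
    total = sum(query_weights.values())
    return 100.0 * matched / total


# Hypothetical document that uses the original terms only.
doc = {"car", "repair"}

# Hypothetical per-term weights, before and after adding a synonym.
plain = {"car": 2.0, "repair": 2.0}
expanded = {"car": 2.0, "automobile": 2.0, "repair": 2.0}

print(percent(doc, plain))     # every query term matches
print(percent(doc, expanded))  # "automobile" is unmatched, so the share drops
```

The document itself has not changed; only the denominator has grown,
because the expanded query contains a term the document does not have.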

You can tweak the weighting scheme to ignore the within-query
frequency of a term when generating weights (and hence percentage
relevancy) in the MSet: you want to set k3 to 0. This may have a
larger effect on the relevance calculations than you expect (and may
well change document ordering in the MSet), but may be worth playing
with.
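
To see what k3 controls, here is a minimal sketch of the query-side
factor in the textbook BM25 formula, (k3 + 1) * wqf / (k3 + wqf), where
wqf is the within-query frequency. The function name is hypothetical,
and this assumes Xapian's BM25Weight follows the textbook form; in
Xapian itself k3 is the third constructor parameter of BM25Weight, which
you pass to Enquire::set_weighting_scheme() (check the API docs for your
version).

```python
def wqf_factor(wqf, k3):
    """Textbook BM25 query-side factor for a term seen wqf times in the query.

    Assumes wqf >= 1 and k3 >= 0. Illustrative only, not Xapian's source.
    """
    return (k3 + 1.0) * wqf / (k3 + wqf)


for wqf in (1, 2, 3):
    # With k3 = 0 the factor is 1 regardless of wqf, so repeated
    # query terms no longer get extra weight.
    print(wqf, wqf_factor(wqf, k3=1.0), wqf_factor(wqf, k3=0.0))
```

With k3 set to 0 every term contributes as if it occurred once in the
query, which is what "ignoring the within-query frequency" means here.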

I suppose in theory we could have an operator that acts as OP_OR but
returns the highest BM25 termweight or something (so the synonyms act
as an expansion inside the query, rather than outside as at the
moment), but I have no idea if that would be generally useful, or
practical with respect to any of the optimisations we do.
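
A sketch of the difference between the two behaviours, again as a toy
model: each synonym group contributes either the sum of its matching
term weights (plain OR) or only the highest one (the hypothetical
operator described above). All names and weights are invented for
illustration, and nothing here reflects how Xapian's matcher is
implemented.

```python
def score(doc_terms, groups, combine):
    """Sum one contribution per synonym group.

    `combine` turns the weights of the matching terms in a group into a
    single value: `sum` models plain OR, `max` models the hypothetical
    "highest termweight" operator.
    """
    total = 0.0
    for group in groups:
        matched = [w for term, w in group.items() if term in doc_terms]
        if matched:
            total += combine(matched)
    return total


# Hypothetical query: (car OR automobile) plus "repair".
groups = [{"car": 2.0, "automobile": 2.0}, {"repair": 2.0}]

only_car = {"car", "repair"}
both = {"car", "automobile", "repair"}

print(score(only_car, groups, sum), score(both, groups, sum))
print(score(only_car, groups, max), score(both, groups, max))
```

Under the max variant a document using only the original term scores the
same as one using both synonyms, so expanding the query does not push
the first document's share of the maximum down.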

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


