[Xapian-discuss] Question about synonyms and relevancy results.
James Aylett
james-xapian at tartarus.org
Thu Jan 3 16:18:15 GMT 2008
On Wed, Jan 02, 2008 at 11:51:47PM -0700, Rusty Conover wrote:
> Why does the use of synonyms decrease relevancy of the returned results?
Because the synonyms probably won't match documents that have the
original terms (in the general case), so there's a lower proportion of
terms in the query matching those documents.
You can tweak the weighting scheme to ignore the within-query
frequency of a term when generating weights (and hence percentage
relevancy) in the MSet: you want to set k3 to 0. This may have a
larger effect on the relevance calculations than you expect (and may
well change document ordering in the MSet), but may be worth playing
with.
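
To illustrate what k3 does, here is a rough sketch of the within-query
frequency factor in BM25, written as standalone Python rather than
against the Xapian API. The function name wqf_factor is made up for
this example, and the formula (k3 + 1) * wqf / (k3 + wqf) is the usual
BM25 form, so treat it as an approximation of what the library computes
internally.

```python
def wqf_factor(wqf, k3):
    """BM25 within-query frequency factor: (k3 + 1) * wqf / (k3 + wqf).

    wqf is how many times the term occurs in the query (assumed >= 1).
    """
    return (k3 + 1.0) * wqf / (k3 + wqf)


# With k3 = 1, a term repeated in the query gets a larger factor.
default_factors = [wqf_factor(q, k3=1.0) for q in (1, 2, 3)]

# With k3 = 0, the factor is 1 whatever the within-query frequency is,
# so wqf no longer influences the weights.
flat_factors = [wqf_factor(q, k3=0.0) for q in (1, 2, 3)]

# Assumption, not run here: with the Python bindings the equivalent
# setting would be something like
#   enquire.set_weighting_scheme(xapian.BM25Weight(1, 0, 0, 0.5, 0.5))
# where the arguments are (k1, k2, k3, b, min_normlen). Check the
# values other than k3 against the defaults of your Xapian version.
```

The other BM25 parameters are left out of the sketch on purpose; only
the part that k3 controls is shown.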
I suppose in theory we could have an operator that acts as OP_OR but
returns the highest BM25 termweight or something (so the synonyms act
as an expansion inside the query, rather than outside as at the
moment), but I have no idea if that would be generally useful, or
practical with respect to any of the optimisations we do.
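
A toy sketch of the difference, in plain Python with made-up term
weights (score_sum, score_max and the numbers are all hypothetical, and
none of this is Xapian code): summing over the synonyms raises the best
attainable weight, so a document matching only the original term scores
a smaller share of it, while taking the highest weight keeps that share
larger.

```python
def score_sum(doc_terms, synonyms, weights):
    """Synonyms as separate OR branches: matching weights add up."""
    return sum(weights[t] for t in synonyms if t in doc_terms)


def score_max(doc_terms, synonyms, weights):
    """Hypothetical operator: only the highest matching weight counts."""
    return max((weights[t] for t in synonyms if t in doc_terms), default=0.0)


# Made-up term weights for an original term and one synonym.
weights = {"car": 2.0, "automobile": 3.0}
synonyms = ["car", "automobile"]

doc_original_only = {"car"}
doc_both = {"car", "automobile"}

# Share of the best attainable weight for a document that matches only
# the original term, under each scheme.
share_sum = (score_sum(doc_original_only, synonyms, weights)
             / score_sum(doc_both, synonyms, weights))
share_max = (score_max(doc_original_only, synonyms, weights)
             / score_max(doc_both, synonyms, weights))
```

With these numbers the document matching only "car" gets 2/5 of the
best attainable weight when the weights are summed, and 2/3 when the
highest weight is taken.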
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org