[Xapian-devel] Explanation of how Eset works

Olly Betts olly at survex.com
Fri Jan 11 22:35:07 GMT 2013


On Fri, Jan 11, 2013 at 12:42:32AM +0530, aarsh shah wrote:
> So basically an ESet is formed by ranking terms based on the combined
> weights (computed with something similar to BM25) assigned to the
> documents in the RSet (formed by the top 5 entries in the MSet, or
> selected by us) which are present in the term's posting list, right?

Yes, that's an accurate summary.

The weighting formula used for generating the ESet is currently
hard-coded to be the original probabilistic formula, which is
essentially BM25 with particular parameters.
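
To make the idea concrete, here is a minimal sketch of ranking candidate
expansion terms with a classic probabilistic relevance weight (the
Robertson/Sparck Jones form with 0.5 smoothing). It is an illustration
only, not Xapian's implementation: the function names and the toy data
are hypothetical, and it deliberately ignores the within-document
frequency and document length components that a BM25-style formula also
takes into account.

```python
import math
from collections import Counter


def rsj_weight(r, n, R, N):
    """Robertson/Sparck Jones relevance weight with 0.5 smoothing.

    r: relevant (RSet) documents containing the term
    n: documents in the whole collection containing the term
    R: number of documents in the RSet
    N: number of documents in the collection
    """
    return math.log(((r + 0.5) * (N - R - n + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))


def rank_expansion_terms(rset_docs, termfreq, collection_size, max_terms=10):
    """Return (term, weight) pairs for terms in the RSet, best first.

    rset_docs: list of term lists, one per relevant document (hypothetical input)
    termfreq: dict mapping term -> number of documents containing it
    """
    R = len(rset_docs)
    r_counts = Counter()
    for terms in rset_docs:
        # Count each term once per document, like walking a posting list.
        r_counts.update(set(terms))
    scored = [(term, rsj_weight(r, termfreq[term], R, collection_size))
              for term, r in r_counts.items()]
    # Highest weight first; break ties alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:max_terms]


# Toy data: two relevant documents from a 1000-document collection.
rset_docs = [["xapian", "index", "the"], ["xapian", "search", "the"]]
termfreq = {"xapian": 5, "index": 20, "search": 30, "the": 1000}
ranked = rank_expansion_terms(rset_docs, termfreq, collection_size=1000)
for term, weight in ranked:
    print(f"{term}\t{weight:.2f}")
```

With this toy data, a term that is rare in the collection but present in
every RSet document ("xapian") ranks first, while a term that occurs in
every document ("the") gets a negative weight and ranks last. In Xapian
itself the ESet is obtained by passing an RSet to Enquire::get_eset().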

We should probably allow this formula to be specified by the user, like
we do for document weights (I've just added that to the ideas list, as
it didn't seem to be there already).

Cheers,
    Olly


