[Xapian-devel] [GSOC 2014] Indexing INEX dataset

James Aylett james-xapian at tartarus.org
Sat Mar 22 12:35:56 GMT 2014


On 22 Mar 2014, at 08:22, Parth Gupta <pargup8 at gmail.com> wrote:

> For unsupervised approaches like BM25 this approach works well, but letor does not need special weighting for the title in this form, as it assigns weights to title features separately itself.
> 
> But I see your concern: it would be a problem when BM25 is used on an index with this setup. Hence it's preferable to take note of this uplift in title weight for xapian-letor and normalize it wherever the statistics are calculated.

This would need configuring, though, wouldn't it? Not everyone (and I'm thinking of people who don't index using omindex here) applies a wdf of 5 while indexing titles; they may apply a different non-1 number, or just leave it at 1 (and possibly apply weighting at search time).
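
To make the configuration point concrete, here is a minimal sketch in plain Python of undoing an index-time title boost before computing statistics. The function name and the title_weight parameter are hypothetical and are not part of xapian-letor or any Xapian API; the only assumption is that the indexer multiplied each title term's wdf by a known constant.

```python
def normalise_title_wdf(raw_wdf: int, title_weight: int = 1) -> float:
    """Return the title wdf with the index-time boost divided out.

    raw_wdf is the wdf stored in the index for a title term.
    title_weight is the (hypothetical) configured multiplier used at
    index time: 5 in the case discussed above, 1 if no boost was applied.
    """
    if title_weight < 1:
        raise ValueError("title_weight must be a positive integer")
    return raw_wdf / title_weight


# Index built with a title wdf of 5: a term occurring twice in the title
# was stored with wdf 10, and normalises back to 2.
boosted = normalise_title_wdf(10, title_weight=5)

# Index built without any boost: the default leaves the wdf unchanged.
unboosted = normalise_title_wdf(3)
```

The default of 1 means an index built without a title boost is left alone, and anyone who used a different multiplier only has to supply their own value.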

J

-- 
 James Aylett, occasional trouble-maker
 xapian.org



