[Xapian-devel] GSoC 2011 Weighting Schemes

Olly Betts olly at survex.com
Tue Mar 29 06:40:50 BST 2011


On Mon, Mar 28, 2011 at 08:26:05PM +0800, wuwenjin wrote:
> As described in http://terrier.org/docs/current/dfr_description.html, there
> are many DFR models. Which of them will be implemented in Xapian?

DPH is apparently very effective, and it's parameter-free (not having
to tune parameters to get the best results is good), so it would
definitely be good to have that one.
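
For a concrete sense of why DPH needs no tuning, here is a minimal sketch
of the per-term score.  It assumes the formula as implemented in Terrier's
DPH model (base-2 logs, query term frequency taken as 1); the function name
dph_score and its parameters are hypothetical and not part of the Xapian
API.  The guard for degenerate inputs is my own addition.

```cpp
#include <cmath>

// Hypothetical helper, not part of the Xapian API.
// Sketch of the DPH per-term score, assuming Terrier's formulation.
//   tf          - occurrences of the term in this document
//   doc_len     - length of this document
//   avg_doc_len - average document length in the collection
//   num_docs    - number of documents in the collection
//   coll_freq   - total occurrences of the term in the collection
double dph_score(double tf, double doc_len, double avg_doc_len,
                 double num_docs, double coll_freq)
{
    // Guard (an assumption of this sketch): avoid log(0) and division by
    // zero when the term is absent or the inputs are degenerate.
    if (tf <= 0.0 || doc_len <= 0.0 || coll_freq <= 0.0 || tf >= doc_len)
        return 0.0;

    const double pi = 3.14159265358979323846;
    const double f = tf / doc_len;                        // relative frequency
    const double norm = (1.0 - f) * (1.0 - f) / (tf + 1.0);

    return norm * (tf * std::log2((tf * avg_doc_len / doc_len) *
                                  (num_docs / coll_freq))
                   + 0.5 * std::log2(2.0 * pi * tf * (1.0 - f)));
}
```

Every input is a statistic of the document or the collection, so there is
nothing like BM25's k1 or b to tune.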

I'm not sure which of the others are the most interesting.  Some models
suit some situations better than others - the page you link to mentions
"classic ad-hoc tasks" and "tasks that require early precision", but I
am not sure which of those is the best option.

I'd suggest picking a representative selection, and aiming to do those.

There's also scope for implementing DFR query expansion in
Enquire::get_eset() if that interests you.  That's probably more
involved since the weights used there aren't pluggable already, so it
might be a good "if there's time at the end" thing to look at, once
you're familiar with the query weighting.
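
To give an idea of what a DFR expansion weight looks like, here is a
minimal sketch of Bo1, one of the DFR query expansion models described in
the Terrier documentation.  Choosing Bo1 is an assumption made for
illustration, and the function name bo1_weight and its parameters are
hypothetical; nothing like this is exposed by Enquire::get_eset() today.

```cpp
#include <cmath>

// Hypothetical helper, not part of the Xapian API.
// Sketch of the Bo1 expansion term weight, assuming Terrier's formulation:
//   w(t) = tf_x * log2((1 + P_n) / P_n) + log2(1 + P_n),  P_n = F / N
//   tf_x      - occurrences of the candidate term in the top-ranked documents
//   coll_freq - F, total occurrences of the term in the collection
//   num_docs  - N, number of documents in the collection
double bo1_weight(double tf_x, double coll_freq, double num_docs)
{
    // Guard (an assumption of this sketch): avoid division by zero.
    if (coll_freq <= 0.0 || num_docs <= 0.0)
        return 0.0;

    const double p_n = coll_freq / num_docs;
    return tf_x * std::log2((1.0 + p_n) / p_n) + std::log2(1.0 + p_n);
}
```

Candidate terms would be ranked by this weight and the top few offered as
expansion terms.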

Cheers,
    Olly


