GSoC 2016 Letor Stabilisation

Olly Betts olly at survex.com
Wed Mar 23 00:21:25 GMT 2016


On Mon, Mar 21, 2016 at 11:29:22PM +0530, Parth Gupta wrote:
> On top of what James has said, I would recommend focusing first on
> VcamX's branch, as he was working on API streamlining while v-hasu was
> implementing additional ranking algorithms. So have a look at it and
> realign your thoughts while working on the proposal. He already tried to
> refactor questletor.cc into more independent tasks such as
> letor-prepare.cc, letor-train.cc etc.

James and I have discussed at some length how best to approach
stabilising the letor code as a project this year, and James wrote up
the result of that here:

https://trac.xapian.org/wiki/GSoCProjectIdeas/LearningtoRankStabilisation

I would strongly encourage students to plan their projects along those
lines.

> I have had a go at merging VcamX's master with xapian master, and the
> result is here: https://github.com/parthg/xapian
> 
> Most of the conflicts are resolved except "MSet" related parts in enquire.h

I don't think the MSet hooks VcamX added are the best way to achieve
this.  For example, this is the new method added:

void MSet::update_letor_information(const vector<Xapian::MSet::letor_item> & letor_items_)

We want to find a clean way to allow letor to reorder an MSet which also
provides a generic interface for MSet reordering.  Tasks like
diversification are also likely to need such an interface, so it
shouldn't be specific to letor.  It also ought to be consistent in
design with the existing API, which the above isn't.
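
To make "generic" a bit more concrete, here is a rough, self-contained
sketch of the general shape such an interface could take.  Every name in
it (RankedItem, Reorderer, LetorReorderer, reorder) is hypothetical and
none of it is existing or proposed Xapian API; it only assumes that a
reorderer is a functor subclass, in the style of existing classes such as
Xapian::MatchDecider, which sees the whole result list so that letor and
diversification could both plug in the same way.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for one MSet entry; NOT a real Xapian type.
struct RankedItem {
    unsigned docid;
    double weight;
};

// Hypothetical generic interface: a functor which may reorder (and
// rescore) a whole result list.  Nothing here is letor-specific.
class Reorderer {
  public:
    virtual ~Reorderer() {}
    virtual void operator()(std::vector<RankedItem>& items) const = 0;
};

// Hypothetical letor-style implementation: replace each weight with a
// score supplied by a trained model, then sort by descending weight.
class LetorReorderer : public Reorderer {
    std::map<unsigned, double> scores;  // assumed model output, by docid

  public:
    explicit LetorReorderer(const std::map<unsigned, double>& scores_)
        : scores(scores_) {}

    void operator()(std::vector<RankedItem>& items) const override {
        for (auto& item : items) {
            auto it = scores.find(item.docid);
            // Items the model has no score for keep their old weight.
            if (it != scores.end()) item.weight = it->second;
        }
        // stable_sort keeps the original order for equal weights.
        std::stable_sort(items.begin(), items.end(),
                         [](const RankedItem& a, const RankedItem& b) {
                             return a.weight > b.weight;
                         });
    }
};

// The only hook the result set itself would need: apply any Reorderer.
void reorder(std::vector<RankedItem>& items, const Reorderer& r) {
    r(items);
}
```

A diversification implementation would be another Reorderer subclass
passed to the same reorder() call, which is the point: the result set
needs to know nothing about letor.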

Cheers,
    Olly


