[Xapian-devel] GSOC - 2013 - Introduction (Learning to Rank)

Parth Gupta pargup8 at gmail.com
Mon Mar 25 11:50:18 GMT 2013


Hi Mudit,

As Olly has pointed out, this year we are not planning to add new ranking
algorithms. Instead, we will consolidate the project around the existing
ranking algorithms. Beyond that, it would be interesting to incorporate one
or more feature selection algorithms.  See the updated Learning to Rank
project description on the ideas page.
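
As a rough illustration of what a simple filter-style feature selection step
could look like, one option is to rank features by the absolute Pearson
correlation with the relevance labels and keep the top k. This is only a
sketch under assumptions: it is not code from xapian-letor, the function
names (pearson, select_features) are hypothetical, and the toy data is made
up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no ranking signal.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, k):
    """Return indices of the k features most correlated (in absolute value) with the labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Toy data (made up): one row of 3 feature values per document,
# with a graded relevance label for each document.
rows = [
    [0.1, 5.0, 1.0],
    [0.4, 5.0, 0.0],
    [0.8, 5.0, 1.0],
    [0.9, 5.0, 0.0],
]
labels = [0, 1, 2, 2]
selected = select_features(rows, labels, 2)
print(selected)
```

Here the constant second feature gets a score of zero and is dropped. A real
implementation would more likely evaluate feature subsets against a ranking
metric than rely on a per-feature correlation.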

If you are interested in working on this project, a great start would be
to fork Rishabh's branch and debug the code. That would give you much more
insight into the project and help you better formulate your application.

Regards,
Parth.

On Fri, Mar 22, 2013 at 6:37 AM, Olly Betts <olly at survex.com> wrote:

> On Thu, Mar 21, 2013 at 06:58:41PM +0530, Mudit Gupta wrote:
> > I am interested in the "Learning To Rank" project.  If I am not wrong, I
> > found the framework incorporated by Parth in the cloned code. It needed
> > some refactoring in order to incorporate more algorithms; that was done
> > by Rishabh and is available in his git repo
> > (https://github.com/rishabhmehrotra/xapian) but is still not merged. So,
> > I assume I should think of additions to the code in Rishabh's repo.
>
> Yes, I think that's the best starting point.
>
> > Moreover, I noticed that SVM-rank, ListMLE and
> > ListNet are already present in the code. I am interested in adding a
> > random forest approach and am looking for appropriate libraries. It would
> > be great to get input from the Xapian community in terms of preference of
> > algorithms and open source libraries. It would also be great to know the
> > priority of the Letor project to the Xapian community.
>
> Parth and I talked this over recently, and we concluded that this year a
> major focus should be on consolidating the existing work.  That doesn't
> necessarily mean that new features can't be looked at, but one of the
> deliverables should really be a xapian-letor module which we're happy to
> tag as a stable release.  A project which adds more algorithms is
> interesting, but if the end result isn't useful to Xapian users, there's
> much less benefit to be had from it.
>
> One of the major things missing is a testsuite.  Without any automated
> tests, it's hard to have much confidence that the code works, and it
> makes it much harder to make changes to the code in the future without
> introducing new bugs.  So I think adding a testsuite is important.
> The harness from xapian-core is suitable, but testcases need writing,
> and the bugs that actually writing testcases will inevitably uncover
> need fixing.
>
> We should also look at what features are missing from xapian-core
> which would be useful for xapian-letor, and consider implementing them -
> especially if they have other potential uses.  Two that I'm aware of
> are:
>
> * Fundamentally, xapian-letor wants to take a Xapian::MSet object and
>   reorder it, so an API which allows that would be handy - then the
>   output of xapian-letor can be an Xapian::MSet object, allowing it to
>   be cleanly slotted into existing applications using the Xapian API.
>   An MSet reordering API also has other potential uses - for example,
>   clustering results.
>
> * Field-related features currently have to be calculated specially by
>   xapian-letor, but these would also be useful to have for other uses
>   (e.g. implementing BM25f) so tracking them in the database backend
>   in xapian-core is worth investigating.
>
> I'll update the entry on the project ideas page with the above shortly.
>
> Cheers,
>     Olly