[Xapian-devel] New Idea on Ranking in IR

Sun Apr 3 15:40:16 BST 2011

On Fri, Apr 01, 2011 at 02:48:28PM +0530, Parth Gupta wrote:
> In Learning to Rank (Letor) we prepare features which represent a
> query-document pair. After the initial retrieval we take, say, the first 20
> or 30 documents and represent them as feature vectors. Then, based on
> the training data, our supervised learning gives a score to each document
> for a particular query. For example, if the learning is done by regression,
> we have to learn a 'W' vector which gives a score to the document vector
> by dot product.
> 
> Here the features can be term frequency, TF-IDF score, BM25 score, etc., as
> many as we find useful. For learning, there are many machine learning
> techniques available.
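
[A minimal sketch of the re-ranking step described in the quoted text, not
code from either poster. It assumes three features per document (term
frequency, TF-IDF, BM25), made-up feature values and relevance labels, and
plain linear regression fitted by stochastic gradient descent. The function
names dot, train_linear and rerank are hypothetical.]

```python
def dot(w, x):
    """Score a feature vector x with weight vector w (the 'W' in the text)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear(examples, labels, lr=0.01, epochs=2000):
    """Fit W by stochastic gradient descent on squared error.

    This is ordinary linear regression, chosen only for illustration;
    any supervised learner that outputs a score could be swapped in.
    """
    w = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            err = dot(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def rerank(w, candidates):
    """Sort (doc_id, features) pairs by descending dot-product score."""
    return sorted(candidates, key=lambda c: dot(w, c[1]), reverse=True)


if __name__ == "__main__":
    # Made-up training data: (tf, tf-idf, bm25) vectors with relevance labels.
    train_x = [[0.2, 0.1, 0.3], [0.9, 0.8, 0.7], [0.5, 0.4, 0.6]]
    train_y = [0.0, 1.0, 0.5]
    w = train_linear(train_x, train_y)

    # Made-up candidates standing in for the top documents of a first pass.
    top_docs = [("doc1", [0.3, 0.2, 0.4]), ("doc2", [0.8, 0.7, 0.9])]
    for doc_id, features in rerank(w, top_docs):
        print(doc_id, round(dot(w, features), 3))
```

[In practice the candidates would be the top 20 or 30 documents from the
initial retrieval, and the labels would come from whatever relevance
judgements are available for training.]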

What would be your plan for gathering data to train with?  Some sort of
click-through measurements?

On Sun, Apr 03, 2011 at 12:37:27PM +0530, Parth Gupta wrote:
> Please give your feedback on the possibility of exploration of the idea so
> that I can incorporate those things in my application.

It seems an interesting project to me, though I'm not sure I know enough
about the area to offer much in the way of useful insights.  I can
probably ask some stupid questions though.

But I'm certainly happy to consider an application from you for working
on this.

Cheers,
    Olly