[Xapian-devel] Questions on letor module

Jia Xu dolremi at gmail.com
Tue Mar 4 23:02:38 GMT 2014


Thank you Parth. It is really helpful for me to understand the project.


On Tue, Mar 4, 2014 at 1:59 PM, Parth Gupta <pargup8 at gmail.com> wrote:

> Hi Jia,
>
>> I have several questions regarding the letor module. I looked at the
>> framework of learning to rank in Xapian
>> http://rishabhmehrotra.com/gsoc/17.png, and I am a little confused. Why
>> use deep learning to find unsupervised features in the test data? In my
>> understanding, a learning-to-rank model usually learns features from the
>> training data and is then applied to the test data. Why do the test set
>> and training set have different features? Also, deep learning extracts
>> hidden features from the data set, and I don't think it is necessary
>> in this problem. Furthermore, I didn't see any implementation of deep
>> learning in the source code. Is it actually included in letor?
>>
>
> The GSoC project proposed by Rishabh was based on extracting
> unsupervised features using deep learning on top of the existing features
> based on term frequency and related statistics. It was never a tested
> hypothesis that this would help; it was an added part. We have since
> dropped the idea of adding the deep learning module, so you don't see any
> code related to it.
>
>>
>>   For the source code
>> https://github.com/rishabhmehrotra/xapian/tree/397034af42c9b1998730160176d219d6f8f38b25/xapian-letor,
>> the last update was about 2 years ago. Is that the latest version of the
>> code? For several files such as ranker.cc and evalmetric.cc, there are no
>> implementations of the functions, and I don't know if they have been
>> implemented somewhere else in the module (as far as I read through the
>> source code, I didn't see any).
>>
>
> That is the latest version of the code and the starting point of this
> year's GSoC project. ranker.cc defines an abstract class that is inherited
> by the implemented rankers such as SVM, ListMLE and ListNET; you can find
> the corresponding definitions in their .cc files. The evaluation part is
> yet to be completed as per the instructions given in evalmetric.h.
>
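
The abstract-ranker arrangement described above can be sketched roughly as follows. This is only an illustration of the pattern, not the xapian-letor code: the names Ranker, LinearRanker, FeatureVector, learn(), score() and rank() are all assumptions made up for this example, and the toy linear model stands in for a real SVM, ListMLE or ListNET ranker.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical type for illustration only; NOT the xapian-letor API.
using FeatureVector = std::vector<double>;

// Abstract base class: each concrete ranker supplies its own learning
// and scoring, while the sorting logic is shared.
class Ranker {
  public:
    virtual ~Ranker() = default;
    virtual std::string name() const = 0;
    // Learn a model from feature vectors and their relevance labels.
    virtual void learn(const std::vector<FeatureVector>& docs,
                       const std::vector<double>& labels) = 0;
    // Score a single document's feature vector.
    virtual double score(const FeatureVector& doc) const = 0;

    // Return document indices ordered by descending score.
    std::vector<std::size_t> rank(const std::vector<FeatureVector>& docs) const {
        std::vector<double> scores;
        scores.reserve(docs.size());
        for (const FeatureVector& d : docs) scores.push_back(score(d));
        std::vector<std::size_t> order(docs.size());
        std::iota(order.begin(), order.end(), std::size_t{0});
        std::stable_sort(order.begin(), order.end(),
                         [&scores](std::size_t a, std::size_t b) {
                             return scores[a] > scores[b];
                         });
        return order;
    }
};

// Toy concrete ranker (an assumption, not SVM/ListMLE/ListNET):
// weight_j is the label-weighted sum of feature j over the training docs.
class LinearRanker : public Ranker {
    std::vector<double> weights_;

  public:
    std::string name() const override { return "linear-toy"; }

    void learn(const std::vector<FeatureVector>& docs,
               const std::vector<double>& labels) override {
        assert(docs.size() == labels.size());
        weights_.assign(docs.empty() ? 0 : docs[0].size(), 0.0);
        for (std::size_t i = 0; i < docs.size(); ++i)
            for (std::size_t j = 0; j < weights_.size() && j < docs[i].size(); ++j)
                weights_[j] += labels[i] * docs[i][j];
    }

    double score(const FeatureVector& doc) const override {
        double s = 0.0;
        const std::size_t n = std::min(weights_.size(), doc.size());
        for (std::size_t j = 0; j < n; ++j) s += weights_[j] * doc[j];
        return s;
    }
};
```

With this shape, code that re-ranks results only needs a reference to the base class, so a new ranker can be added as another subclass without touching the callers.
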
>> For the tests, are there any benchmark tests of the SVM-based or ListNET
>> models on sample datasets, and what are their NDCG or MAP scores? (I
>> didn't see any evaluation measures implemented in the current
>> module.) And how about cross-validation for the training set? Is there
>> any method included in the current project?
>>
>
> For the SVM-based model, benchmark results are available at
> http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#IREvaluationofLetorrankingscheme
>
> Actually, the first step of the new project will be to generate this figure
> for the SVM-based model with the new refactored code, which was mostly
> written during GSoC 2012 but never tested. We would appreciate it if the
> prospective students of the Letor project could generate this value before
> the student selection deadline.
>
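
For readers unfamiliar with the measures mentioned in this exchange, here is a minimal sketch of how MAP and NDCG are commonly computed from relevance judgments listed in ranked order. It is not taken from evalmetric.h; the function names are hypothetical, average precision here assumes every relevant document appears in the ranked list, and the NDCG uses the common (2^rel - 1) / log2(rank + 1) gain and discount, which is one of several variants.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Average precision for one query. `rel` holds binary relevance (0 or 1)
// of the results in ranked order. Assumes all relevant documents are in
// the list, so the number of hits is used as the normaliser.
double average_precision(const std::vector<int>& rel) {
    double hits = 0.0, sum = 0.0;
    for (std::size_t i = 0; i < rel.size(); ++i) {
        if (rel[i] > 0) {
            hits += 1.0;
            sum += hits / static_cast<double>(i + 1);  // precision at rank i+1
        }
    }
    return hits > 0.0 ? sum / hits : 0.0;
}

// Mean of the per-query average precision values.
double mean_average_precision(const std::vector<std::vector<int>>& queries) {
    if (queries.empty()) return 0.0;
    double total = 0.0;
    for (const std::vector<int>& q : queries) total += average_precision(q);
    return total / static_cast<double>(queries.size());
}

// Discounted cumulative gain over the top k results, with graded,
// non-negative relevance labels.
double dcg_at_k(const std::vector<int>& rel, std::size_t k) {
    double dcg = 0.0;
    const std::size_t n = std::min(k, rel.size());
    for (std::size_t i = 0; i < n; ++i) {
        assert(rel[i] >= 0);
        dcg += (std::pow(2.0, rel[i]) - 1.0) / std::log2(static_cast<double>(i) + 2.0);
    }
    return dcg;
}

// DCG normalised by the DCG of the ideal (descending relevance) ordering.
double ndcg_at_k(const std::vector<int>& rel, std::size_t k) {
    std::vector<int> ideal(rel);
    std::sort(ideal.begin(), ideal.end(), std::greater<int>());
    const double idcg = dcg_at_k(ideal, k);
    return idcg > 0.0 ? dcg_at_k(rel, k) / idcg : 0.0;
}
```

A benchmark figure such as the MAP score discussed above would then be the result of calling mean_average_precision() on the per-query relevance lists produced by the ranker under test.
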
>
>>
>> For the SVM method, I found that letor_learn_model() has been commented
>> out, but I didn't find any other file that contains this function (or
>> maybe it is in letor_internal.cc?).
>>
>> Finally, I found a file called letor_internal_refactored.cc. Is that
>> the latest version of letor_internal.cc? Is letor_internal.cc
>> still being used?
>>
>
> Right. svmranker.cc is yet to be defined. Right now the SVM-based ranker
> is available only in non-refactored form, which lies in
> letor_internal_refactored.cc
>
> I think the best exercise is to prepare svmranker.cc from
> letor_internal_refactored.cc by implementing the necessary methods and
> generating the MAP score reported on the INEX data; that would give you a
> better grip of the code. I would love to see a patch for it.
>
> Cheers,
> Parth.
>
>
>> Thank you very much. I am waiting for your reply.
>>
>> --
>> Jia Xu
>>
>


-- 
Jia Xu