[Xapian-devel] [GSOC 2014] Some questions about Letor module

Jiarong Wei vcamx3 at gmail.com
Sun Mar 9 10:12:45 GMT 2014


Thank you very much, Parth!

Jiarong Wei

On Mar 9, 2014, at 3:07, Parth Gupta <pargup8 at gmail.com> wrote:

> The queries are usually referred to as topics.
> 
> Thanks for your reply! For the third question: at https://inex.mmci.uni-saarland.de/data/documentcollection.jsp, I can find inex2010-article.qrels under the 2010 assessments, but I can’t find the query files. Could you send me the link?
> 
> 2010: https://inex.mmci.uni-saarland.de/protected/adhoc/2010-topics.xml
> 2009: https://inex.mmci.uni-saarland.de/protected/adhoc/2009-topics.zip
>  
> I have registered on the INEX website. I also need to download ``INEX 2009 collection without annotation tags: (unofficial)`` from http://www.mpi-inf.mpg.de/departments/d5/software/inex/, right?
> 
> Right, those would be the documents to be indexed.
> 
> Parth. 
> 
> Thank you!
> 
> Jiarong Wei
> 
> On Mar 9, 2014, at 0:52, Parth Gupta <pargup8 at gmail.com> wrote:
> 
>> Hi Jiarong Wei,
>> 
>>  
>> 1. In https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal.cc#L299, there is a write_to_file method, which saves the RankList into “train.txt”. But the format of “train.txt” differs from the one described in http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm. And in https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal_refactored.cc#L716, Qid and DocID become optional. What format should we use for “train.txt”? Is there a sample “train.txt” available?
>> 
>> 
>> You can find a sample training file in the resources of the Learning-to-Rank project on the Xapian GSoC ideas page.
>>  
>> 2. http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm says “the first column is the relevance judgement”. I think the value of the relevance judgement is just 0 or 1, but the code saves it as a “double”. Is that just for convenience? Or am I misunderstanding the whole thing?
>> 
>> In the INEX set it is binary, but in other datasets it may take higher integer values and sometimes real values. Hence the double.
>>  
>> 
>> 3. I’ve got the qrels file of INEX 2010, but I can’t find the query file. How can I get it? I can’t find it on the INEX website.
>>  
>> Have you checked the instructions about that, which I recently added to the project idea page? Basically, you have to register on the INEX website to obtain the data.
>> 
>> Cheers,
>> Parth.
>> 
>> Thank you!
>> 
>> Jiarong Wei
>> 
