[Xapian-devel] [GSOC 2014] Some questions about Letor module

Parth Gupta pargup8 at gmail.com
Sun Mar 9 10:07:02 GMT 2014

The queries are usually referred to as topics.

> Thanks for your reply! For the third question: In
> https://inex.mmci.uni-saarland.de/data/documentcollection.jsp, I can
> find inex2010-article.qrels in 2010 assessment, but can't find query files.
> Could you send me the link?

2010: https://inex.mmci.uni-saarland.de/protected/adhoc/2010-topics.xml
2009: https://inex.mmci.uni-saarland.de/protected/adhoc/2009-topics.zip

> I have registered on the INEX website. And I also need to download ``INEX 2009
> collection without annotation tags: (unofficial)`` on
> http://www.mpi-inf.mpg.de/departments/d5/software/inex/, right?

Right, those would be the documents to be indexed.
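
On the train.txt question in the quoted thread: the exact layout should be taken from the sample training file on the ideas page. As a rough illustration only, the sketch below assumes the common LETOR/SVM-light style line (`<label> qid:<id> <feature>:<value> ... #docid = <id>`), which is an assumption and is not confirmed anywhere in this thread. `Example` and `parse_line` are hypothetical names, not part of xapian-letor. The first column is read as a float so that binary, graded, and real-valued judgements all fit, and qid and docid are treated as optional.

```python
from typing import Dict, NamedTuple, Optional


class Example(NamedTuple):
    """One training row (hypothetical container, not a xapian-letor type)."""
    label: float            # relevance judgement: 0/1, graded, or real-valued
    qid: Optional[str]      # query (topic) id, if present
    features: Dict[int, float]
    docid: Optional[str]    # document id from the trailing comment, if present


def parse_line(line: str) -> Example:
    """Parse one line of an ASSUMED LETOR/SVM-light style training file."""
    # Everything after '#' is treated as a comment that may carry the docid.
    body, _, comment = line.partition("#")
    tokens = body.split()
    if not tokens:
        raise ValueError("empty training line")

    # First column is the relevance judgement; float covers all cases.
    label = float(tokens[0])

    qid = None
    features = {}
    for tok in tokens[1:]:
        key, sep, value = tok.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {tok!r}")
        if key == "qid":
            qid = value
        else:
            features[int(key)] = float(value)

    # Accept "docid = 1234", "docid=1234", or a bare "1234" in the comment.
    comment = comment.strip()
    docid = comment.split("=", 1)[-1].strip() if comment else None

    return Example(label, qid, features, docid)
```

For example, `parse_line("1 qid:20 1:0.5 2:0.25 #docid = 1234")` yields a row with a float label, while a line such as `"2.5 1:0.1"` still parses with `qid` and `docid` left as `None`.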


> Thank you!
> Jiarong Wei
> On Mar 9, 2014, at 0:52, Parth Gupta <pargup8 at gmail.com> wrote:
> Hi Jiarong Wei,
>> 1. In
>> https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal.cc#L299,
>> there is a write_to_file method, which saves RankList into "train.txt". But
>> the format for "train.txt" is different from the one mentioned in
>> http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm. And in
>> https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal_refactored.cc#L716,
>> Qid and DocID become optional. What format should we use for "train.txt"?
>> Is there any sample "train.txt" available?
> You can find a sample training file in the resources of the
> Learning-to-Rank project on the Xapian GSoC ideas page.
>> 2. http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm says
>> "the first column is the relevance judgement". I think the value
>> of the relevance judgement is just 0 or 1, but the code saves it as a
>> "double". Is that just for convenience, or am I misunderstanding the whole thing?
> In the INEX set it is binary, but for other datasets it may take higher
> integer values and sometimes real values, hence the double.
>> 3. I've got the qrels file of INEX 2010, but I can't find the query file. How can I
>> get it? I can't find it on the INEX website.
> Have you checked the instructions about that which I recently added to
> the project idea page? Basically, you have to register on the INEX website to
> obtain the data.
> Cheers,
> Parth.
>> Thank you!
>> Jiarong Wei