[Xapian-devel] [GSOC 2014] Some questions about Letor module

Parth Gupta pargup8 at gmail.com
Sun Mar 9 08:52:19 GMT 2014


Hi Jiarong Wei,



> 1. In
> https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal.cc#L299,
> there is a write_to_file method, which saves a RankList into "train.txt". But
> the format for "train.txt" is different from the one mentioned in
> http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm. And in
> https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal_refactored.cc#L716,
> Qid and DocID become optional. What format should we use for "train.txt"?
> Is there any sample "train.txt" available?
>
>
You can find a sample training file in the resources of the Learning-to-Rank
project on the Xapian GSoC ideas page.


> 2. In http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm, it
> mentions "the first column is the relevance judgement". I think the value
> of the relevance judgement is just 0 or 1. But the code saves it as a
> "double". Is it just for convenience? Or do I misunderstand the whole thing?
>

In the INEX set it is binary, but in other datasets it may take higher
integer values and sometimes real values. Hence the "double".
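
As a rough illustration of why a floating-point label is convenient, here is
a minimal Python sketch of reading one training line. It assumes an
SVM-light/LETOR-style layout (label first, then an optional "qid:<id>", then
"index:value" feature pairs, then an optional trailing "#docid = <id>"
comment). That layout, and the names RankEntry and parse_line, are assumptions
for illustration only, not the actual letor code; check the sample training
file for the real format.

```python
from typing import Dict, NamedTuple, Optional


class RankEntry(NamedTuple):
    """One line of a training file (hypothetical structure)."""
    label: float              # relevance judgement: binary, graded or real
    qid: Optional[str]        # query id, None when absent
    docid: Optional[str]      # document id, None when absent
    features: Dict[int, float]


def parse_line(line: str) -> RankEntry:
    # Everything after '#' is treated as a comment that may carry the docid.
    body, _, comment = line.partition("#")
    tokens = body.split()

    # Parsing the first column as float covers 0/1, graded and real labels.
    label = float(tokens[0])

    qid = None
    features = {}
    for tok in tokens[1:]:
        key, _, value = tok.partition(":")
        if key == "qid":
            qid = value
        else:
            features[int(key)] = float(value)

    # Assumed comment shape: "docid = <id>" (spaces around '=' optional).
    docid = None
    name, _, val = comment.partition("=")
    if name.strip() == "docid":
        docid = val.strip()

    return RankEntry(label, qid, docid, features)


binary = parse_line("1 qid:20 1:0.5 2:0.25 #docid = 42")
graded = parse_line("2.5 1:0.1")  # no qid, no docid, real-valued label
```

Because the label is read as a float, the same reader handles the binary INEX
judgements and graded or real-valued judgements from other datasets without
any change. The sketch does not handle blank or malformed lines.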


>
> 3. I've got the qrels file of INEX 2010, but I can't find the query file.
> How can I get it? I can't find it on the INEX website.
>

Have you checked the instructions on that, which I recently added to
the project idea page? Basically, you have to register on the INEX website
to obtain the data.

Cheers,
Parth.

>
> Thank you!
>
> Jiarong Wei