[Xapian-devel] A Little Help

Rishabh Mehrotra erishabh at gmail.com
Fri Jul 27 18:41:45 BST 2012


Hello Parth,

Thanks for the reply. I had similar concerns about the user-friendliness
of the LETOR module, but then I realized that we won't be returning this
RankList to questletor: the RankList will stay inside Letor and will not
appear anywhere in the user's code (questletor). If this holds true, the
user-friendliness of our module stays intact.

My requirements for making RankList recognized inside the Letor class are
based on the following points:

   - It's not letor_score() for which I was planning on using RankList in
   Letor. Sorry, I should have made this clear in my previous mail.
   *letor_score()* doesn't need to return a RankList. Even in the current
   implementation it returns map<Xapian::docid, double>, so no issues here.


   - As per the current implementation of prepare_training_file(), the
   entire training data is read into a list<RankList>, and this
   list<RankList> is then to be saved to a file. This looks a bit complex, as
   each RankList in the list<RankList> has a vector<FeatureVector> and each
   FeatureVector has 4 associated variables which need to be saved to the
   file (this includes a map<int,double>). Saving all this nested information
   seemed messy and I was a bit reluctant to go ahead with it, hence I wanted
   to confirm this with you.

<We did discuss this yesterday, but at that point I hadn't looked into the
exact nature of the data that needed to be stored in the file.>


   - *Possible solution:* If instead we create a list<RankList> *variable
   as part of the Letor class*, then the prepare_training_file() method would
   just update this variable, and as long as we have an instance of the Letor
   class alive, we would have this variable to use in subsequent operations.
   Hence, we won't need to save the complex-looking list<RankList> data to a
   file and then read it back.


   - We discussed on IRC yesterday that doing so would prevent users from
   using their own training file. If we look at the possibilities, 2
   cases arise:
      - *User has a training file:* We take in the training file, update
      Letor's list<RankList> variable using this file at the end of the
      prepare_training_file() function and proceed normally.
      - *User doesn't have a training file:* In this case we would want to
      use an already existing training file to do the training, which would
      require that we save the list<RankList> somewhere. An alternative is
      to use the model learnt from this data directly: instead of saving
      this list<RankList>, we save the model parameters learnt from it,
      which we do anyway in the save_model() function. Doing so eliminates
      the need to save the RankList for future use, without any extra effort.
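
To make the proposal above concrete, here is a minimal sketch of the idea.
It is not the actual xapian-letor code. The class name LetorSketch, the
methods set_training_data(), learn() and score(), the field names inside
FeatureVector and RankList, and the toy weighting rule are all assumptions
made for illustration. docid is a stand-in for Xapian::docid so that the
snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <vector>

typedef unsigned docid;  // stand-in for Xapian::docid (assumption)

// Hypothetical field names: four members, one of them a map<int,double>.
struct FeatureVector {
    docid did;
    double label;
    double score;
    std::map<int, double> fvals;
};

// Hypothetical layout: one query and the feature vectors of its documents.
struct RankList {
    std::string qid;
    std::vector<FeatureVector> docs;
};

class LetorSketch {
    // Training data lives in memory for as long as the object is alive,
    // so it never has to be written to a file and read back.
    std::list<RankList> training_data;
    // The learnt parameters are the only state that would need saving.
    std::map<int, double> weights;

  public:
    // Plays the role of prepare_training_file(): update the member only.
    void set_training_data(const std::list<RankList>& data) {
        training_data = data;
    }

    // Toy learning rule (not a real ranker):
    // weight[f] = sum over all documents of label * feature value.
    void learn() {
        weights.clear();
        for (std::list<RankList>::const_iterator r = training_data.begin();
             r != training_data.end(); ++r) {
            for (std::size_t i = 0; i < r->docs.size(); ++i) {
                const FeatureVector& fv = r->docs[i];
                for (std::map<int, double>::const_iterator f = fv.fvals.begin();
                     f != fv.fvals.end(); ++f) {
                    weights[f->first] += fv.label * f->second;
                }
            }
        }
    }

    // Returns only docid -> score. In the real module the RankList would be
    // built from the MSet inside letor_score(); it is passed in here just to
    // keep the sketch self-contained.
    std::map<docid, double> score(const RankList& rl) const {
        std::map<docid, double> result;
        for (std::size_t i = 0; i < rl.docs.size(); ++i) {
            const FeatureVector& fv = rl.docs[i];
            double s = 0.0;
            for (std::map<int, double>::const_iterator f = fv.fvals.begin();
                 f != fv.fvals.end(); ++f) {
                std::map<int, double>::const_iterator w = weights.find(f->first);
                if (w != weights.end()) s += w->second * f->second;
            }
            result[fv.did] = s;
        }
        return result;
    }
};
```

With this shape, only the weights map would have to be persisted (the job
save_model() already does), and the caller never sees a RankList in a
return type.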


*Problem with going ahead with this:*
I do not know how to include the ranklist header file in
xapian-letor/include/letor.h.
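
One direction that might avoid the include altogether, sketched here under
assumptions and not checked against the real headers: if the
list<RankList> is held in a class that is defined only in the .cc file, the
public header never has to mention RankList. The names LetorPimplSketch,
Internal, add_training_query() and training_size() are hypothetical, and
the three parts are shown in a single file only so that the snippet
compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// ---- part that would live in the public header ----
class LetorPimplSketch {
    class Internal;      // only declared here; defined in the .cc part
    Internal* internal;  // pointer to the hidden state

    // Copying is disabled to keep the sketch short.
    LetorPimplSketch(const LetorPimplSketch&);
    LetorPimplSketch& operator=(const LetorPimplSketch&);

  public:
    LetorPimplSketch();
    ~LetorPimplSketch();
    void add_training_query(const std::string& qid);
    std::size_t training_size() const;
};

// ---- part that would live in the private ranklist header ----
class RankList {
  public:
    std::string qid;
    explicit RankList(const std::string& qid_) : qid(qid_) {}
};

// ---- part that would live in the .cc file, which can see both ----
class LetorPimplSketch::Internal {
  public:
    std::list<RankList> training_data;  // RankList is visible only here
};

LetorPimplSketch::LetorPimplSketch() : internal(new Internal) {}

LetorPimplSketch::~LetorPimplSketch() { delete internal; }

void LetorPimplSketch::add_training_query(const std::string& qid) {
    internal->training_data.push_back(RankList(qid));
}

std::size_t LetorPimplSketch::training_size() const {
    return internal->training_data.size();
}
```

The trade-off is one extra indirection per call, in exchange for keeping
RankList out of everything the user's code includes.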

Please let me know if I have overlooked some point with respect to the
availability of the training file or the feasibility/applicability of the
solution.

Regards,
Rishabh.

On Sat, Jul 28, 2012 at 12:50 AM, Parth Gupta <parthg.88 at gmail.com> wrote:

> Hi Rishabh,
>
> I think it's better not to expose RankList in Letor.h, so that it stays
> user friendly. So my suggestion is to convert the RankList to the return
> type used in the following declaration of this method.
>
> std::map<Xapian::docid, double> letor_score(const Xapian::MSet & mset);
>
> So just convert the RankList to std::map<Xapian::docid, double> format in
> the methods where you need to return it.
>
> Parth.
>
>
> On Fri, Jul 27, 2012 at 5:06 PM, Rishabh Mehrotra <erishabh at gmail.com> wrote:
>
>> Hi,
>> I had a little doubt: How do I make a RankList recognizable in Letor.h?
>> *letor.h* resides in *xapian/xapian-letor/include/xapian/* whereas
>> *ranklist.h* resides in *xapian/xapian-letor/*. I want a function in
>> letor.cc to return a RankList so the function declaration in letor.h
>> requires RankList to be recognized.
>>
>> Thanks.
>> Rishabh.
>>
>>
>