Hi Rishabh,<br><br>Actually, storing those four variables in a training file is, in my opinion, the best way to go ahead, so that we produce a training file in the end. This training file can be generated in some directory, let's say /etc or /var, in xapian-letor.<br>
<br>Usually these training files are used to train the model, and the corresponding model is then saved.<br><br>I understand your point about storing the information in a data structure, but I am afraid that it will become a lot more difficult to process the information outside the API, say for analysis of features, data, etc.<br>
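<br>A minimal sketch of what writing such a file could look like. Everything here is an assumption made for illustration: the thread does not name the four FeatureVector variables, so the fields <b>label</b>, <b>score</b>, <b>did</b> and <b>fvals</b>, the <b>qid</b> member, the helper <b>write_training_data()</b> and the one-line-per-document layout are all hypothetical and not the actual xapian-letor code.<br>

```cpp
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in: the four fields below are assumed, not taken
// from the real xapian-letor FeatureVector.
struct FeatureVector {
    double label;                 // relevance judgement (assumed)
    double score;                 // score assigned by the ranker (assumed)
    unsigned did;                 // document id (assumed)
    std::map<int, double> fvals;  // feature index -> feature value
};

// Hypothetical stand-in for RankList: one query and its feature vectors.
struct RankList {
    std::string qid;
    std::vector<FeatureVector> rl;
};

// Writes one line per FeatureVector in an assumed layout:
//   <label> qid:<qid> <idx>:<val> ... #docid=<did> score=<score>
void write_training_data(std::ostream& out, const std::vector<RankList>& lists) {
    for (const RankList& ranklist : lists) {
        for (const FeatureVector& fv : ranklist.rl) {
            out << fv.label << " qid:" << ranklist.qid;
            for (const auto& feature : fv.fvals)
                out << ' ' << feature.first << ':' << feature.second;
            out << " #docid=" << fv.did << " score=" << fv.score << '\n';
        }
    }
}
```

<br>Because the nested structure flattens to one text line per document, the file stays easy to inspect or analyse with tools outside the API.<br>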
<br>So let's stick to the convention and use the files.<br><br>Parth.<br><br><div class="gmail_quote">On Fri, Jul 27, 2012 at 11:11 PM, Rishabh Mehrotra <span dir="ltr"><<a href="mailto:erishabh@gmail.com" target="_blank">erishabh@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello Parth,<div><br></div><div>Thanks for the reply. I had similar concerns regarding the user friendliness of the LETOR module, but then I realized that we won't be returning this RankList to questletor; the RankList will stay inside Letor and will not appear anywhere in the user's code (questletor). If this holds true, then the user friendliness of our module stays intact.</div>
<div><br></div><div>My requirements for making RankList recognized inside the Letor class are based on the following points:</div><div><ul><li>It's not letor_score() for which I was planning to use RankList in Letor. Sorry, I should have made this clear in my previous mail. <b>letor_score()</b> doesn't need to return a RankList. Even in the current implementation it returns map<Xapian::docid, double>. So no issues here.</li>
</ul><ul><li>As per the current implementation of prepare_training_file(), the entire training data is read into a list<RankList> and then this list<RankList> is to be saved to a file. This looks a bit complex, as each RankList in list<RankList> has a vector<FeatureVector> and each FeatureVector has 4 associated variables which need to be saved to the file (this includes a map<int,double>). Saving all this nested information seemed messy; I was a bit reluctant to go ahead with this, hence I wanted to confirm this with you. </li>
</ul></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><Though we discussed this yesterday, until then I hadn't looked into the exact nature of the data that needed to be stored in the file.></div>
</blockquote></blockquote><div><ul><li><b>Possible solution:</b> If instead we create a list<RankList> <b>variable as part of the Letor class</b>, then the prepare_training_file() method would just update this variable, and as long as we have an instance of the Letor class alive, we would have this variable to use in subsequent operations. Hence, we won't need to save the complex-looking list<RankList> data to a file and then read it back.</li>
</ul><ul><li>We discussed on IRC yesterday that doing so would be a problem for users who want to use their own training file. If we look at the possibilities, 2 cases arise:</li><ul><li><b>User has a training file:</b> We take in the training file, update Letor's list<RankList> variable using this file at the end of the prepare_training_file() function and proceed normally.</li>
<li><b>User doesn't have a training file:</b> If the user doesn't have a training file then we would want to use an already existing training file to do the training, which would require that we save the list<RankList> somewhere. An alternative is to use the model learnt from this data directly, that is, instead of saving this list<RankList> we save the model parameters learnt using this data, which we do anyway in the save_model() function. Doing so eliminates the need for saving the RankList for future use without any extra effort.</li>
</ul></ul></div><div><br></div><div><b>Problem with going ahead with this:</b></div><div>I do not know how to include the ranklist header file in xapian-letor/include/letor.h.</div><div><br></div><div>Please let me know if I have overlooked some point with respect to the availability of the training file and the feasibility/applicability of the solution.</div>
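<div><br></div><div>One way the proposal above might be sketched is to keep the list<RankList> behind a private implementation struct, so that the public header never needs to see ranklist.h. This is only an illustration under stated assumptions: the nested <b>Internal</b> struct, the <b>training_query_count()</b> helper, the simplified no-argument prepare_training_file() signature, its placeholder body and the FeatureVector/RankList stand-ins are all hypothetical, and both "files" are shown in a single block so it compiles on its own.</div>

```cpp
#include <cstddef>
#include <list>
#include <map>
#include <memory>
#include <vector>

// ---- What the public header could contain (no ranklist.h needed) ----
class Letor {
    struct Internal;                     // defined only in the implementation
    std::unique_ptr<Internal> internal;  // owns the hidden training data
  public:
    Letor();
    ~Letor();                            // defined where Internal is complete
    void prepare_training_file();        // simplified, hypothetical signature
    std::size_t training_query_count() const;  // hypothetical helper
};

// ---- What the implementation file could contain ----
// Hypothetical stand-ins for the real FeatureVector and RankList classes.
struct FeatureVector {
    double label;
    std::map<int, double> fvals;
};
struct RankList {
    std::vector<FeatureVector> rl;
};

// The list<RankList> lives here, invisible to users of the header.
struct Letor::Internal {
    std::list<RankList> training_data;
};

Letor::Letor() : internal(new Internal) {}
Letor::~Letor() = default;

void Letor::prepare_training_file() {
    // Placeholder body: appends one dummy query with a single feature vector.
    RankList one_query;
    one_query.rl.push_back(FeatureVector{1.0, {{1, 0.5}}});
    internal->training_data.push_back(one_query);
}

std::size_t Letor::training_query_count() const {
    return internal->training_data.size();
}
```

<div>With this layout the cached list lasts as long as the Letor instance does, and user code that includes only the header never sees RankList.</div>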
<div><br></div><div>Regards,</div><div>Rishabh.</div><div class="HOEnZb"><div class="h5"><div><br><div class="gmail_quote">On Sat, Jul 28, 2012 at 12:50 AM, Parth Gupta <span dir="ltr"><<a href="mailto:parthg.88@gmail.com" target="_blank">parthg.88@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Rishabh,<br><br>I think it's better not to expose RankList in Letor.h, so that it stays user friendly. So my suggestion is to convert the RankList to the return type used by the following method.<br>
<br>std::map<Xapian::docid, double> letor_score(const Xapian::MSet & mset);<br>
<br>So just convert the RankList to std::map<Xapian::docid, double> format in the methods where you need to return it.<span><font color="#888888"><br><br>Parth.</font></span><div><div>
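<br>A rough sketch of the conversion suggested above, under stated assumptions: <b>docid</b> is a local stand-in for Xapian::docid so the block compiles without Xapian, and the <b>did</b> and <b>score</b> fields, the <b>rl</b> member and the <b>to_score_map()</b> helper are hypothetical names, not the real xapian-letor ones.<br>

```cpp
#include <map>
#include <vector>

// Stand-in for Xapian::docid so the sketch builds without Xapian installed.
using docid = unsigned;

// Hypothetical minimal FeatureVector: only the fields the conversion needs.
struct FeatureVector {
    docid did;     // document id (assumed field name)
    double score;  // score assigned by the ranker (assumed field name)
};

// Hypothetical minimal RankList: the feature vectors for one query.
struct RankList {
    std::vector<FeatureVector> rl;
};

// Flattens a RankList into the docid -> score map that the public API returns,
// so RankList itself never has to appear in the public header.
std::map<docid, double> to_score_map(const RankList& ranklist) {
    std::map<docid, double> scores;
    for (const FeatureVector& fv : ranklist.rl)
        scores[fv.did] = fv.score;
    return scores;
}
```

<br>The map is keyed by docid, so the ordering of the RankList is not preserved; a caller that needs ranked order would have to sort by score afterwards.<br>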
<br><br><div class="gmail_quote">On Fri, Jul 27, 2012 at 5:06 PM, Rishabh Mehrotra <span dir="ltr"><<a href="mailto:erishabh@gmail.com" target="_blank">erishabh@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<div>I had a little doubt: How do I make a RankList recognizable in Letor.h? </div><div><b>letor.h</b> resides in <b>xapian/xapian-letor/include/xapian/</b> whereas <b>ranklist.h</b> resides in <b>xapian/xapian-letor/</b>. I want a function in letor.cc to return a RankList so the function declaration in letor.h requires RankList to be recognized.</div>
<div><br></div><div>Thanks. <span><font color="#888888"><br>Rishabh.<br><br>
</font></span></div>
</blockquote></div><br>
</div></div></blockquote></div><br>
</div>
</div></div></blockquote></div><br>