<div dir="ltr"><div><div><div><div>Hi, <br><br></div>Before starting my proposal, I wanted to know what the expected output of the Letor module is. Is it meant for transfer learning (i.e. you learn from one dataset and use what was learned to predict rankings on another dataset), or is it for supervised learning?<br>
</div><br></div>For instance, Xapian currently powers the Gmane search, which by default uses the BM25 weighting scheme. Suppose we want to use LETOR to rank the top k retrieved search results, taking SVMRanker as an example. Since the client won't be providing any training file, will it rank Gmane's search results using the weights learned from the INEX dataset? I also don't think that will perform well when the two datasets have different distributions. So how are we going to use it?<br>
<br></div>PROPOSAL-<br><div><div><div><br>1. Sorting out the Letor API will include -<br><ul><li>Implementing SVMRanker and checking its evaluation results against the already generated values.</li></ul><ul><li>Implementing evaluation methods, including MAP and NDCG. (<i>Is there any other method in particular that should be implemented besides these two?</i>)</li>
</ul><ul><li>Checking the performance of ListMLE and ListNet against SVMRanker. (<i>I am assuming both ListMLE and ListNet have been implemented correctly, but we don't have any tested performance measurements for these two algorithms.</i> <i>What should the course of action be here?</i>)<br>
</li></ul><ul><li>Implementing a rank aggregator. I've read about the <b>Kemeny-Young method</b>. Can you give me the names of the algorithms that should be implemented here, or of those that were proposed the year before last? Also, is there a way to check a ranker's performance (<i>since the INEX dataset doesn't provide rankings</i>)?</li>
</ul><p>2. Implementing automated tests will include -</p><ul><li>For testing, 20 documents and 5 queries can be picked from the INEX dataset, run as test cases, and checked against their expected outputs.</li></ul><ul><li>The implemented evaluation metrics can also be used to test the learning algorithms.</li>
</ul><p>3. Implementing a feature selection algorithm -</p><ul><li>I have a question here. Why are we planning to implement a feature selection algorithm when we have only 19 features? I don't think a model will over-fit the dataset with so few features. Also, from what I have learnt, feature selection algorithms (like PCA in classification) are used only for time or space efficiency.</li>
</ul><p>Please do provide some feedback so that I can improve upon it.</p><p>-Mayank<br></p></div></div></div></div>
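<p>P.S. To make the evaluation item in point 1 concrete, here is a minimal sketch of how I understand MAP and NDCG. It is plain Python for brevity and is not Letor code: the function names are my own, the average precision assumes every relevant document for a query appears in the ranked list, and the NDCG gain of 2^rel - 1 with a log2(rank + 1) discount is one common convention that may not be the one Letor should adopt.</p>
<pre>
```python
import math


def average_precision(relevance):
    """Average precision for one ranked list of binary labels (1 = relevant).

    Assumption: all relevant documents for the query are in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(relevance_lists):
    """MAP: mean of the per-query average precision values."""
    if not relevance_lists:
        return 0.0
    total = sum(average_precision(r) for r in relevance_lists)
    return total / len(relevance_lists)


def dcg(relevance, k):
    """Discounted cumulative gain over the top k graded labels.

    Assumed convention: gain 2^rel - 1, discount log2(rank + 1).
    """
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(relevance[:k], start=1))


def ndcg(relevance, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevance, reverse=True), k)
    return dcg(relevance, k) / ideal if ideal else 0.0
```
</pre>
<p>Each input list holds the relevance labels of the documents in the order the ranker returned them, so the same helpers could score SVMRanker, ListMLE and ListNet on identical queries.</p>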