[Xapian-devel] Feature Selection algorithm

Sun Mar 16 20:21:13 GMT 2014

Hi Parth,

The research paper
<http://research.microsoft.com/en-us/people/tyliu/fsr.pdf> on a
feature selection algorithm for ranking describes the importance scores of
features as follows:

"We first assign an importance score to each feature. Specifically, we
propose using an evaluation measure like MAP and NDCG (the definitions of
them will be given in Section 3) or a loss function (e.g. pair-wise ranking
errors [10][13]) to compute the importance score. In the former, we first *rank
instances(1)* using the feature, evaluate the performance in terms of the
measure, and then take the evaluation result as the importance score. In
the latter, we also rank instances using the feature, and then view a score
inversely proportional to the corresponding loss as the importance score.
Note that for some features larger values correspond to higher ranks while
for other features smaller values correspond to higher ranks, when
calculating MAP, NDCG or the loss of ranking models, *we actually sort the
instances for two times (in the normal order and in the inverse order),
and take the larger score as the importance score of the feature.(2)*"

1. Is it OK if we rank them with SVMRanker? SVMRanker is a linear-kernel
SVM, so how did you tune the parameter C (the penalty for the error term)?
Did you use grid search for C?

2. I couldn't understand what they mean by these lines in bold. Could you
please explain them to me?
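
To make the question concrete, here is a minimal Python sketch of my
tentative reading of the quoted passage: rank the documents of a query by a
single feature, score that ranking with NDCG, repeat with the order
reversed, and keep the larger score. The function names are my own
(hypothetical), the NDCG form (gain 2^rel - 1, log2 discount) is one common
variant that I am assuming rather than taking from the paper, and it
handles a single query only, whereas I assume the paper averages over
queries. Please correct me if this is not what they mean.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance labels given in ranked order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def feature_importance(feature_values, relevances):
    """Rank by one feature in both orders and keep the larger NDCG.

    Assumption: one query only; feature_values[i] and relevances[i]
    belong to the same document.
    """
    pairs = list(zip(feature_values, relevances))
    # Normal order: larger feature value means higher rank.
    descending = [rel for _, rel in
                  sorted(pairs, key=lambda p: p[0], reverse=True)]
    # Inverse order: smaller feature value means higher rank.
    ascending = [rel for _, rel in sorted(pairs, key=lambda p: p[0])]
    return max(ndcg(descending), ndcg(ascending))
```

With this reading, a feature whose small values go with relevant documents
gets the same importance as one whose large values do, which I think is the
point of sorting twice.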

*PS*: I've sent a proposal for Letor. It would be great if you could review
it and tell me whether any detail is missing or I've overlooked something,
so that I can improve it.