xapian-letor: FeatureVector discussion

Parth Gupta pargup8 at gmail.com
Mon Jun 27 13:38:58 BST 2016


Hi Ayush

Thanks for bringing up the issue for discussion. It is still possible to
use enums as feature IDs without relying on their order. We are simply
defining the IDs in the way we need. Usually a good approach is to group
similar features, e.g. term-document score features such as the BM25 score,
LM score etc. go in a separate group with a specific ID range. New features
can then either extend the present range or be accommodated within it.
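
To make the grouping idea concrete, here is a rough sketch. The enum
names, the ID ranges and the helper functions are all made up for
illustration and are not the actual xapian-letor numbering. It only assumes
that 'fvals' stays a std::map<int,double>, so lookups go by ID and never by
position:

```cpp
#include <map>

// Hypothetical feature IDs, grouped by similarity. Names and ranges are
// illustrative assumptions, not the real xapian-letor values.
enum FeatureId {
    // Term-document score features: IDs 100-199.
    FEATURE_BM25_SCORE = 100,
    FEATURE_LM_SCORE = 101,
    // Document-level features: IDs 200-299.
    FEATURE_DOC_LENGTH = 200
};

// True if the ID falls in the (assumed) term-document score range.
inline bool is_term_doc_score(int id) {
    return id >= 100 && id < 200;
}

// Look a feature up by ID. The map is keyed by ID, so the order in which
// features were added does not matter; a missing feature yields 'fallback'.
inline double feature_or(const std::map<int, double>& fvals,
                         FeatureId id, double fallback = 0.0) {
    std::map<int, double>::const_iterator it = fvals.find(id);
    return it == fvals.end() ? fallback : it->second;
}
```

A new term-document score feature would just take the next free ID in its
range (102 here), without disturbing any of the existing IDs.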

The rankers will rank a particular instance using the features that are
present (not necessarily all of them, nor in order). In fact, a user can
specify which features s/he wants to work with, and the feature manager will
ensure they are calculated and update 'fvals'.
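
As a rough sketch of that selection step: the class below is a hypothetical
stand-in, not the real feature manager, and the registration interface is
invented for illustration. It assumes each feature can be computed by a
callable and that only the requested IDs end up in 'fvals':

```cpp
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the feature manager; the interface is an
// assumption made for this sketch only.
class FeatureManagerSketch {
    // One calculator per feature ID.
    std::map<int, std::function<double()>> calculators;

  public:
    // Register how a feature with the given ID is computed.
    void register_feature(int id, std::function<double()> fn) {
        calculators[id] = std::move(fn);
    }

    // Compute only the features the user selected and return them as
    // 'fvals'. Unselected or unknown IDs are simply absent from the map.
    std::map<int, double> compute(const std::vector<int>& selected) const {
        std::map<int, double> fvals;
        for (int id : selected) {
            auto it = calculators.find(id);
            if (it != calculators.end())
                fvals[id] = it->second();
        }
        return fvals;
    }
};
```

A ranker reading the resulting map sees only the selected features, keyed
by ID, whatever order the user listed them in.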

I am still missing some bits on the second approach. Can you please give a
little more information on it?

Cheers
Parth


On Mon, Jun 27, 2016 at 5:46 PM, Ayush Tomar <ayushtomar at gmail.com> wrote:

> Hello James, Parth,
>
> Following our discussion on IRC and on code review, the way FeatureVector
> class works needs some discussion.
>
> Presently, the FeatureVector class is defined as follows, with a fixed
> feature count (19):
>
> class FeatureVector::Internal : public Xapian::Internal::intrusive_base{
>     friend class FeatureVector;
>     double label;
>     double score;
>     std::map<int,double> fvals;
>     int fcount;
>     Xapian::docid did;
>
> The two approaches that were discussed were:
> 1. Using enums as IDs for features in fvals.
> 2. Making fvals into a configurable vector of feature values.
>
> The issues were that the first way would still assume an order in which
> the features occur, and the second way would require the feature generation
> code to be split into lots of little classes, which might be overhead
> right now but would be good functionality to have in the future.
>
> What would be the best approach here?
> --
>
> ----------------------------------------------------------------------------
> Kind Regards,
> Ayush Tomar | My Webpage <http://ayshtmr.xyz> | LinkedIn
> <https://in.linkedin.com/in/ayushtomar>
>