Letor: returning MSet after re-ranking

James Aylett james-xapian at tartarus.org
Thu Jul 28 15:15:31 BST 2016


On Thu, Jul 28, 2016 at 03:58:25PM +0530, Ayush Tomar wrote:

> As it was discussed, at present letor_rank(const Xapian::MSet & mset)
> method returns a vector of sorted docids after performing re-ranking,
> whereas ideally, it should return a re-ranked MSet.

Yes, absolutely; that way you can 'just' add LTR with some lines
between creating the MSet and iterating over it to display or
whatever.

> To be able to do this, I was thinking of adding
> rerank_mset(vector<double> updated_weights) method to mset.h, which
> basically updates the weight of each MSetItem in MSet with weights
> (or score, in terms of letor) obtained after re-ranking, and then
> sorts the vector<Xapian::Internal::MSetItems> items by these updated
> weights.

I don't know enough about the MSet internals to know if that's
safe. (If the MSetItems are constructed all at the beginning, then it
should be okay.)
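
For reference, the update-and-sort step being proposed could look roughly like the sketch below. This is not Xapian code: `Item` and `rerank_items` are hypothetical stand-ins for Xapian::Internal::MSetItem and the suggested rerank_mset(), and the sketch assumes the scores arrive in the same order as the items.

```c++
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for Xapian::Internal::MSetItem; the real class
// carries more state than this.
struct Item {
    unsigned did;  // document id
    double wt;     // weight (the letor score after re-ranking)
};

// Overwrite each item's weight with the matching score, then sort by
// descending weight. Assumes scores[i] belongs to items[i].
void rerank_items(std::vector<Item>& items, const std::vector<double>& scores) {
    if (items.size() != scores.size())
        throw std::invalid_argument("one score per item is required");
    for (std::size_t i = 0; i != items.size(); ++i)
        items[i].wt = scores[i];
    // stable_sort keeps the original order for items with equal scores.
    std::stable_sort(items.begin(), items.end(),
                     [](const Item& a, const Item& b) { return a.wt > b.wt; });
}
```

Whether this is safe inside the real MSet depends on the question above: it only works if every item already exists when the scores are applied.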

I was thinking more generally that the user API we care about is
something like this:

```c++
// Given mset already constructed from an Enquire object
auto ranker = Xapian::Letor::ListNetRanker("path/to/model");
// either this:
mset.rerank(ranker);
// or this:
mset = ranker.rerank(mset);

// Now we just treat MSet as normal...
for (MSetIterator i = mset.begin();
     i != mset.end();
     ++i) {
    cout << i.get_rank() + 1 << ": " << i.get_weight() << " docid=" << *i << "\n";
}
```

I'd prefer to avoid adding things to the public API that don't get
used by end users. However because LTR is outside the Xapian build
tree, we can't easily give it privileged access to Xapian internals.

I'm not sure that doing it by passing just a list of doubles is the
right approach, either. Maybe we could make something like:

```c++
namespace Xapian {

class RankIterator;

class MSetRanker {
  public:
    void set_mset(const MSet & matches);
    RankIterator begin();
    RankIterator end();
};

}
```

And then MSet::Internal::rerank(MSetRanker) sets the MSet that the
ranker is working over, and iterates over it getting back everything
it needs to build new Xapian::Internal::MSetItem objects and so
repopulate its items member.
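
A minimal, self-contained sketch of that flow, to show the shape of the interaction rather than any real implementation. Every name here (`RankedItem`, `RankerSketch`, `MSetSketch`) is a hypothetical stand-in, a plain vector iterator plays the role of RankIterator, and the scoring function is supplied by the caller instead of coming from a trained model.

```c++
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for Xapian::Internal::MSetItem.
struct RankedItem {
    unsigned did;  // document id
    double wt;     // weight
};

// Stand-in for the proposed MSetRanker: it is handed the matches, scores
// them, and exposes the re-ranked items through begin()/end().
class RankerSketch {
    std::function<double(const RankedItem&)> score_;
    std::vector<RankedItem> ranked_;
  public:
    explicit RankerSketch(std::function<double(const RankedItem&)> score)
        : score_(std::move(score)) {}

    // Plays the role of set_mset(): copy the matches, rescore, and sort
    // by descending score.
    void set_items(const std::vector<RankedItem>& items) {
        ranked_ = items;
        for (RankedItem& item : ranked_) item.wt = score_(item);
        std::stable_sort(ranked_.begin(), ranked_.end(),
                         [](const RankedItem& a, const RankedItem& b) {
                             return a.wt > b.wt;
                         });
    }
    std::vector<RankedItem>::const_iterator begin() const { return ranked_.begin(); }
    std::vector<RankedItem>::const_iterator end() const { return ranked_.end(); }
};

// Stand-in for MSet::Internal: rerank() hands its items to the ranker,
// then rebuilds its own items member from what the ranker yields.
class MSetSketch {
    std::vector<RankedItem> items_;
  public:
    explicit MSetSketch(std::vector<RankedItem> items) : items_(std::move(items)) {}
    void rerank(RankerSketch& ranker) {
        ranker.set_items(items_);
        items_.assign(ranker.begin(), ranker.end());
    }
    const std::vector<RankedItem>& items() const { return items_; }
};
```

The point of this shape is that the ranker only sees the matches through one entry point and hands back items through iterators, so nothing LTR-specific needs access to MSet internals.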

J

-- 
  James Aylett, occasional trouble-maker
  xapian.org


