Weighting recent results

Olly Betts olly at survex.com
Tue May 3 02:03:43 BST 2016


On Fri, Apr 22, 2016 at 12:23:15PM -0400, Alex Aminoff wrote:
> I did some digging and found a thread from 2011 talking about how to
> subclass Xapian::PostingSource in order to incorporate the date or
> recency of a document in its weighting:
> 
> http://thread.gmane.org/gmane.comp.search.xapian.general/8849/focus=8856
> 
> As in that thread, I want to be clear that I don't want to sort by
> date, but rather incorporate date information into the score by
> which I sort the results. I may be able to stumble around and figure
> this out, but I wonder if any current xapian users have done
> something like this and how did it work out?

I know some people have done recency boosting along the lines of that
thread, but they don't seem to be speaking up about their experiences.

I've not done this directly myself, but the main trick is probably
finding a suitable amount to boost by, so that the weight from recency
and the weight from content combine in a balanced way.
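
To make the balancing problem concrete, here is a rough sketch in plain
Python (no Xapian calls).  It assumes the recency boost is simply added to
the content weight, and that the boost decays exponentially with document
age.  The function names, the half-life decay and the example numbers are
all assumptions for illustration, not something taken from the thread or
from Xapian itself.

```python
def recency_boost(age_days, max_boost=1.0, half_life_days=30.0):
    """Hypothetical boost: max_boost for a brand new document,
    halving every half_life_days after that."""
    age_days = max(age_days, 0.0)  # treat future-dated documents as new
    return max_boost * 0.5 ** (age_days / half_life_days)


def combined_score(content_weight, age_days, max_boost=1.0,
                   half_life_days=30.0):
    """Assumed combination: content weight plus recency boost."""
    return content_weight + recency_boost(age_days, max_boost,
                                          half_life_days)


# Two made-up documents: an old, slightly better content match and a
# new, slightly worse one, as (content_weight, age_days).
old_doc = (4.0, 365.0)
new_doc = (3.6, 1.0)

for max_boost in (1.0, 0.2):
    old_score = combined_score(*old_doc, max_boost=max_boost)
    new_score = combined_score(*new_doc, max_boost=max_boost)
    winner = "new" if new_score > old_score else "old"
    print(f"max_boost={max_boost}: old={old_score:.3f} "
          f"new={new_score:.3f} -> {winner} document ranks first")
```

With the larger max_boost the newer document overtakes the better content
match; with the smaller one it doesn't.  That's the sort of tuning which
needs checking against the range of content weights your real queries
produce.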

> We are a perl shop, but I guess I will need to figure out some C++
> in order to do this?

Currently some work would be needed to pull this off in Perl.

Search::Xapian doesn't wrap PostingSource, so for 1.2.x you'd need
to write XS wrappers for this class, which isn't trivial if you want to
be able to subclass in Perl.

The new SWIG-based Perl bindings in 1.3.x wrap PostingSource, but don't
currently support subclassing in Perl (because SWIG's support for doing
so in Perl was added more recently).  Enabling it is probably fairly
easy.

However, some of the details of the SWIG-based Perl bindings may change
before they're declared stable in 1.4.x:

https://trac.xapian.org/ticket/523

That's one of the last two bugs blocking 1.4.0, and I'm currently
working on the other one.  As noted in that ticket, we might defer that
one past 1.4.0, but it'll be a high priority to address early in 1.4.x.

So it really depends what timescale you're looking at for getting this
implemented.

Cheers,
    Olly


