[Xapian-devel] [Developing and Starting with Xapian]
Olly Betts
olly at survex.com
Fri Jan 18 06:25:37 GMT 2013
On Thu, Dec 27, 2012 at 01:16:03AM +0530, Abhishek Shah wrote:
> Thanks Dan and Gaurav for the suggestion. It was interesting to read the
> project ideas, and the weighting schemes and learning to rank ideas seemed
> interesting. I was already familiar with unigram language modeling and
> BM25, and became a little more familiar with bigram language modeling by
> reading Gaurav's project plan. I have gone through the basic project idea,
> which says that I need to implement some other weighting schemes such as
> Divergence From Randomness. I have gone through the Divergence From
> Randomness tutorial and understood the theory. I would like to code and
> try out different urn randomness models. Can you suggest how to proceed
> with contributing to the weighting schemes project idea?
Sorry this didn't get a response earlier (I missed it until now due to
the holidays). I think we may have discussed it on IRC since, but
here's a response which will help those searching the list archives in
the future even if it's no longer useful to you.
There's an example of writing your own simple weighting scheme here:
http://getting-started-with-xapian.readthedocs.org/en/latest/howtos/weighting_scheme.html#custom-weighting-schemes
What that how-to doesn't currently cover is that if your subclass uses
any of the statistics Xapian tracks, you need to tell Xapian so that it
knows to make them available. You do this by calling need_stat() in the
constructor of your subclass, once for each enum value you need from here:
http://trac.xapian.org/browser/trunk/xapian-core/include/xapian/weight.h#L36
For example, this is what BM25Weight does:
http://trac.xapian.org/browser/trunk/xapian-core/include/xapian/weight.h#L396
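To make the pattern concrete, here is a self-contained toy sketch of the
idea. It is not Xapian's code and does not use Xapian's headers: ToyWeight,
its Stats struct, supply(), score() and the flag values are all invented
for illustration. Only the shape is meant to carry over: the subclass
constructor calls need_stat() for each statistic it uses, and the framework
then provides only those.

```cpp
#include <cmath>

// Toy stand-in for a weighting-scheme base class. This is NOT Xapian's
// real header: the class, the flag values and supply() are invented here.
class ToyWeight {
  public:
    // All the statistics the pretend framework could provide.
    struct Stats {
        double collection_size;  // number of documents in the collection
        double termfreq;         // number of documents containing the term
        double wdf;              // occurrences of the term in this document
    };

    virtual ~ToyWeight() {}

    // Called by the pretend framework: copies in only the statistics the
    // subclass asked for and leaves the others at zero.
    void supply(const Stats& all) {
        stats_.collection_size =
            (needed_ & COLLECTION_SIZE) ? all.collection_size : 0.0;
        stats_.termfreq = (needed_ & TERMFREQ) ? all.termfreq : 0.0;
        stats_.wdf = (needed_ & WDF) ? all.wdf : 0.0;
    }

    // Hypothetical scoring hook, standing in for the real virtual methods.
    virtual double score() const = 0;

  protected:
    // Hypothetical flags; one bit per statistic so they can be combined.
    enum stat_flags { COLLECTION_SIZE = 1, TERMFREQ = 2, WDF = 4 };

    ToyWeight() : needed_(0), stats_() {}

    // Record that the subclass wants a particular statistic.
    void need_stat(stat_flags flag) { needed_ |= flag; }

    const Stats& stats() const { return stats_; }

  private:
    unsigned needed_;
    Stats stats_;
};

// A simple TF-IDF style scheme: it declares all three statistics it reads.
class ToyTfIdf : public ToyWeight {
  public:
    ToyTfIdf() {
        need_stat(COLLECTION_SIZE);
        need_stat(TERMFREQ);
        need_stat(WDF);
    }

    double score() const {
        const Stats& s = stats();
        return s.wdf * std::log(s.collection_size / s.termfreq);
    }
};

// A scheme that only reads wdf, so it only asks for wdf.
class ToyWdfOnly : public ToyWeight {
  public:
    ToyWdfOnly() { need_stat(WDF); }

    double score() const { return stats().wdf; }
};
```

In this toy, a statistic which wasn't requested is simply left at zero.
For the real thing, check the weight.h links above for the actual flag
names and the virtual methods your subclass must implement, since those
are what matter and they can differ between Xapian versions.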
Cheers,
Olly