[Xapian-devel] non-snowball stemmer

Richard Boulton richard at lemurconsulting.com
Tue Jan 9 15:30:48 GMT 2007


Oleg V Obolenskiy wrote:
> Hi!
> 
> I am going to use a non-Snowball Russian stemmer with Xapian. There is a
> good one at http://www.aot.ru. I've found that the current implementation of
> Xapian::Stem does not allow this (there is no public interface for
> Xapian::Stem::Internal). Do you apply patches? Are there any
> recommendations for writing patches?

Yes, contributions are welcome.  See the "HACKING" file in the 
xapian-core sources for details of how to submit patches.

Your patch should be against the current Xapian Subversion head, if
possible.  Also, note that Xapian's Snowball stemmers are quite out of
date - in particular, they don't use the UTF-8 encoding.  The stemmers
are due to be updated for the next release (i.e., the 1.0 release - they
couldn't be updated for the 0.9.x series, because this would have broken
existing databases).

For reference, the relevant section of the HACKING file is as follows:

Submitting Patches:
===================

If you have a patch to fix a problem in Xapian, or to add a new feature,
please send it to us for inclusion.  Any major changes should be discussed
on the xapian-devel mailing list first:
<http://www.xapian.org/lists.php>

We find patches in unified diff format easiest to read.  If you're using a
SVN checkout just use "svn diff" to generate the diff.  If you're working
from a tarball, compare against the original versions of files using
"diff -puN" (-p reports the function name for each chunk).

Please set the width of a tab character in your editor to 8 spaces.
Failing to do so will make it much harder for us to merge in your changes.

We will do our best to give credit where credit is due - if we have used
patches from you, or received helpful reports or advice, we will add your
name to the AUTHORS file (unless you specifically request us not to).  If
you see we have forgotten to do this, please draw it to our attention so
that we can address the omission.
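
As a rough illustration of the tarball case described in the excerpt above,
the following sketch builds a small unified diff with "diff -puN".  The file
names (stem.c.orig, stem.c, stemmer.patch) and the file contents are
hypothetical and exist only so that the example can be run on its own; in a
real tree you would compare your edited file against a saved copy of the
original.

```shell
# Work in a scratch directory so nothing in a real source tree is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical "original" file and an edited copy of it.
printf 'int stem_length(void)\n{\n\treturn 1;\n}\n' > stem.c.orig
printf 'int stem_length(void)\n{\n\treturn 2;\n}\n' > stem.c

# -u: unified format, -p: show the enclosing function name where diff can
# find one, -N: treat missing files as empty.
# diff exits with status 1 when the files differ, which is expected here.
diff -puN stem.c.orig stem.c > stemmer.patch || [ $? -eq 1 ]

# In an SVN checkout the excerpt suggests "svn diff" instead, for example:
#   svn diff > stemmer.patch
cat stemmer.patch
```

The resulting stemmer.patch is the file you would attach to a mail to the
list.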

-- 
Richard


