[Xapian-discuss] How to use a custom stemmer from Python bindings?

Eugene! esizikov at gmail.com
Tue Feb 2 11:40:58 GMT 2010


Hi Olly,

Hunspell supports morphological analysis and stemming based on
language dictionaries, so it gives more precise (correct) stemmed
forms. That's why I'd like to use it for stemming instead of Xapian's
built-in "algorithmic" stemmer.

2010/2/2 Olly Betts <olly at survex.com>:
> On Tue, Feb 02, 2010 at 01:22:11PM +0600, Eugene! wrote:
>> I'm using Xapian bindings for Python in my project. How could I use a
>> custom stemmer instead of the included one (Snowball)?
>
> You can't use a custom stemmer in place of a Xapian::Stem object currently
> - there's an experimental patch which allows this in ticket #186, but that
> couldn't be easily wrapped by SWIG so it has been left for now:
>
> http://trac.xapian.org/ticket/186
>
> You can just split the text into words yourself, stem them with your own
> algorithm, and then add them using Document.add_term() or
> Document.add_posting().
>
>> The one I'm
>> looking at right now is Hunspell (http://hunspell.sourceforge.net/)
>> which has Python bindings (http://code.google.com/p/pyhunspell/).
>
> How does it compare to Snowball?
>
> Cheers,
>    Olly
>
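
A minimal sketch of the approach Olly describes above (split the text,
stem each word yourself, add the result with Document.add_posting()).
The helper name index_text, the word-splitting regex, and the stem
callable are illustrative assumptions, not part of Xapian or Hunspell;
stem stands in for whatever wrapper is written around the Hunspell
bindings. The doc argument is expected to be a xapian.Document, but
the function only relies on it having an add_posting(term, position)
method.

```python
import re

# Assumption: "words" are runs of word characters; real text may need
# a smarter tokenizer.
WORD_RE = re.compile(r"\w+")


def index_text(doc, text, stem):
    """Split text into words, stem each one, and add it to doc.

    doc  -- object with add_posting(term, position), e.g. xapian.Document
    stem -- callable mapping a lowercased word to its stemmed form
            (hypothetical adapter around a custom stemmer such as Hunspell)
    """
    for position, match in enumerate(WORD_RE.finditer(text), start=1):
        word = match.group(0).lower()
        # Positions are recorded so phrase and proximity searches can work.
        doc.add_posting(stem(word), position)


# Intended usage with the Xapian bindings (not run here):
#   import xapian
#   doc = xapian.Document()
#   index_text(doc, "some text to index", my_stem_function)
#   database.add_document(doc)
```

Since the terms are stored in stemmed form, query words would need to
go through the same stem function before searching, otherwise they
will not match what was indexed.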


