[Xapian-discuss] how to add polish stem and stop words?

Olly Betts olly at survex.com
Tue Jun 23 08:48:52 BST 2009


On Sat, Jun 20, 2009 at 11:09:39PM +0200, Rafi wrote:
> I've read one thread about Polish stemming, but no solutions were shown
> there. Is there any way to add a missing stemmer myself?

Was that this thread?

http://thread.gmane.org/gmane.comp.search.xapian.general/6682/focus=6683

What I said there is pretty much still the case - what's needed is
someone who understands Polish to evaluate the Polish stemmers available
to see if they're suitable for our purposes (which is to conflate forms
of words with a common meaning to improve recall).
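
To make "conflate forms to improve recall" concrete, here is a minimal
sketch in Python.  It is a toy only: the suffix list, toy_stem() and
build_index() are made up for illustration, they are not part of Xapian,
and this is certainly not a linguistically sound Polish stemmer.  It just
shows inflected forms mapping to one index term, so that a query for one
form also finds documents containing the others.

```python
# Toy illustration only: NOT a real Polish stemmer and not part of Xapian.
# The suffix list is a hypothetical, hand-picked sample.
TOY_SUFFIXES = ("ami", "ach", "em", "om", "a", "y", "u")


def toy_stem(word, min_stem_len=3):
    """Strip the longest matching suffix, keeping at least min_stem_len chars."""
    word = word.lower()
    for suffix in sorted(TOY_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


# Documents 1-3 each contain a different inflected form of the same word.
docs = {1: "kot", 2: "koty", 3: "kotem", 4: "pies"}
index = build_index(docs)

# The query is stemmed the same way, so one form matches all three documents.
matches = sorted(index.get(toy_stem("kota"), set()))
print(matches)
```

A real stemmer has to avoid conflating unrelated words as well as merge
related ones, which is why the candidates need evaluating by someone who
knows the language.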

It's simpler to integrate an algorithm written in Snowball, but provided
the licence is compatible, we can add support for algorithms written in
C or C++.

Including a new algorithm currently requires patching Xapian (though we're
happy to consider including support for useful stemmers in future
releases).  You can't currently provide your own stemming algorithm by
subclassing Xapian::Stem - this came up recently, so see that thread for
details:

http://thread.gmane.org/gmane.comp.search.xapian.general/7445/focus=7468

Cheers,
    Olly


