[Xapian-discuss] best method for stemming

Alex Deucher alexdeucher at gmail.com
Tue Feb 7 22:15:49 GMT 2006


Hi,

    I am implementing an index using Xapian via the Perl bindings and
I'd like to know what the preferred method is for stemming.  Having
browsed through the archives, I've seen several approaches.  I will be
indexing millions of documents, so I want to make sure this is done
optimally.  I'm willing to sacrifice disk space for faster lookups.
Is it better to stem while indexing, or to stem the query and treat it
like a wildcard, or am I off altogether?  Right now, as I iterate
through each document, I stem the words and add the stem to the index
at the same position as the non-stemmed word.
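
To make the current approach concrete, here is a minimal sketch, written
in Python rather than Perl so that it runs without Xapian installed.
Everything in it is an assumption for illustration: toy_stem is a
hypothetical stand-in for a real stemmer, and a plain dict of posting
lists stands in for the database.  It only shows the idea of storing the
stem and the unstemmed word at the same position, and then stemming the
query so the lookup is an exact match rather than a wildcard expansion.

```python
from collections import defaultdict


def toy_stem(word):
    """Crude suffix stripper; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_document(index, doc_id, text):
    """Add each word, and its stem, to the index at the same position."""
    for pos, word in enumerate(text.lower().split(), start=1):
        index[word].append((doc_id, pos))
        stem = toy_stem(word)
        if stem != word:
            # The stem shares the position of the unstemmed word.
            index[stem].append((doc_id, pos))


def lookup(index, query_word):
    """Stem the query word and do one exact lookup (no wildcard expansion)."""
    return index.get(toy_stem(query_word.lower()), [])


index = defaultdict(list)
index_document(index, 1, "Indexing stemmed words")
print(lookup(index, "indexed"))  # prints [(1, 1)]
```

In this sketch the extra cost is paid at index time, as additional
postings on disk, while each query term needs only a single exact lookup.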

Thanks,

Alex


