[Xapian-discuss] Proper noun stemming

Matthew Somerville matthew at mysociety.org
Thu Mar 27 15:16:02 GMT 2008


Colin Bell wrote:
> Does anyone know why, what am I doing wrong?

As explained on http://www.xapian.org/docs/termgenerator.html :

"Now we index all words lowercased with positional information, and also 
stemmed with a 'Z' prefix (unless they start with a digit), but without 
positional information. By default a Xapian::Stopper is used to avoid 
indexing stemmed forms of stopwords (tests show this shaves around 1% off the 
database size)."

So it doesn't index the stemmed ('Z'-prefixed) forms of stopwords, but it 
does still index their unstemmed forms, with positions - otherwise you 
couldn't do a phrase search for something with a stopword in it.
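
To make that concrete, here is a rough sketch in plain Python of the scheme the docs describe. It is only a toy model, not Xapian's TermGenerator: the stopword list, the `toy_stem` function and `index_text` are all made up for illustration, the "stemmer" just strips a couple of suffixes, and punctuation is not handled.

```python
# Toy model of the indexing scheme quoted above. This is NOT Xapian's
# TermGenerator; the stemmer and stopword list below are stand-ins.

STOPWORDS = {"the", "of", "a", "to"}  # assumed sample list, not Xapian's


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips 'ing' or 's'."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_text(text):
    """Return {term: [positions]}; stemmed 'Z' terms carry no positions."""
    postings = {}
    for position, raw in enumerate(text.split(), start=1):
        word = raw.lower()
        # Unstemmed form: always indexed, together with its position.
        postings.setdefault(word, []).append(position)
        # Stemmed form: skipped for stopwords and for words that start
        # with a digit.
        if word in STOPWORDS or word[0].isdigit():
            continue
        # Stemmed form gets a 'Z' prefix and no positional information.
        postings.setdefault("Z" + toy_stem(word), [])
    return postings


terms = index_text("The walking of the dogs")
for term in sorted(terms):
    print(term, terms[term])
```

In this model "the" and "of" end up in the index with their positions, so a phrase match over them is still possible, but no "Zthe" or "Zof" term is ever created.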

ATB,
Matthew


