[Xapian-discuss] Proper noun stemming

Colin Bell colinabell at gmail.com
Thu Mar 27 15:42:18 GMT 2008


Hurray, all is well again. Thanks Matthew.

I had only recently upgraded from 0.9.x to 1.0.5 and was completely
caught off guard by this when I re-indexed. I have adjusted my code to
look for the Z prefix and remove it before showing the terms.
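
For anyone who finds this in the archives, here is a minimal sketch of
that adjustment. It is not my actual code: the function name and the
sample terms are made up for illustration, and it assumes the terms came
from the default term generator, which lowercases unstemmed words, so a
leading capital 'Z' can only be the stemmed-term marker.

```python
def strip_stem_prefix(term):
    """Return the term without the 'Z' prefix that marks a stemmed form.

    Hypothetical helper. Assumes unstemmed terms are lowercase, so a
    word such as 'zebra' is never mistaken for a stemmed term.
    """
    if term.startswith("Z"):
        return term[1:]
    return term


# Made-up terms standing in for whatever the database returns.
terms = ["Zstem", "stemming", "zebra"]
display = [strip_stem_prefix(t) for t in terms]
```

If you use other capital-letter prefixes of your own (for boolean
filters, say), they would need separate handling; this only deals
with 'Z'.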

Many thanks for your help.

Regards

Colin

PS I'm looking for a Xapian consultant with C / C++ knowledge who is
based in the South West (Bristol), or who comes out this way now and
again and could spare a few hours, or a day if you come out specially.
If you're interested, please email me personally with your rate.

On 27 Mar 2008, at 15:16, Matthew Somerville wrote:

> Colin Bell wrote:
>> Does anyone know why, what am I doing wrong?
>
> As explained on http://www.xapian.org/docs/termgenerator.html :
>
> "Now we index all words lowercased with positional information, and also
> stemmed with a 'Z' prefix (unless they start with a digit), but without
> positional information. By default a Xapian::Stopper is used to avoid
> indexed stemmed forms of stopwords (tests show this shaves around 1% off
> the database size)."
>
> So it doesn't index stemmed stopwords, but does still index their unstemmed
> forms - otherwise you couldn't do a phrase search for something with a
> stopword in it.
>
> ATB,
> Matthew



