[Xapian-discuss] Proper noun stemming
Colin Bell
colinabell at gmail.com
Thu Mar 27 15:42:18 GMT 2008
Hurray, all is well again. Thanks Matthew.
I had only recently upgraded from 0.9x to 1.0.5 and was completely
caught off guard by this when I re-indexed. I have adjusted my code to
look for the Z prefix and remove it before showing the terms.
Many thanks for your help.
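
For anyone hitting the same thing, here is a minimal sketch of that kind of adjustment in Python. It is not the actual code from my application: the function names are made up for illustration, and it assumes the terms are plain unprefixed words as produced by the default term generator, so that a leading uppercase 'Z' can only be the stem marker.

```python
def strip_stem_prefix(term: str) -> str:
    """Return the term without a leading 'Z' stem marker, if present."""
    # Assumption: ordinary terms are lowercased at index time, so an
    # uppercase 'Z' at the start is taken to be the stemmed-form prefix.
    return term[1:] if term.startswith("Z") else term


def displayable_terms(terms):
    """Strip 'Z' prefixes and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for term in terms:
        plain = strip_stem_prefix(term)
        if plain not in seen:
            seen.add(plain)
            result.append(plain)
    return result
```

Duplicates are dropped because a stem and an unstemmed word can collapse to the same string once the prefix is gone (for example "Zcat" and "cat").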
Regards
Colin
PS I'm looking for a Xapian consultant with C / C++ knowledge who is
based in the South West (Bristol), or who comes out this way now and again
and can spare a few hours, or a day if you come out specially. If you're
interested, please email me personally with your rate.
On 27 Mar 2008, at 15:16, Matthew Somerville wrote:
> Colin Bell wrote:
>> Does anyone know why, what am I doing wrong?
>
> As explained on http://www.xapian.org/docs/termgenerator.html :
>
> "Now we index all words lowercased with positional information, and also
> stemmed with a 'Z' prefix (unless they start with a digit), but without
> positional information. By default a Xapian::Stopper is used to avoid
> indexed stemmed forms of stopwords (tests show this shaves around 1% off
> the database size)."
>
> So it doesn't index stemmed stopwords, but does still index their unstemmed
> forms - otherwise you couldn't do a phrase search for something with a
> stopword in it.
>
> ATB,
> Matthew
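
The rule quoted above can be modelled in a few lines of Python. This is only a toy illustration of the behaviour described, not Xapian's implementation: the stopword list, the `toy_stem` function (which just strips a trailing "s" and is not the stemmer Xapian uses) and the name `model_index_text` are all assumptions made for the example.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "to", "in"}


def toy_stem(word: str) -> str:
    """Stand-in stemmer: strips a trailing 's'. Not Xapian's stemmer."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def model_index_text(text: str):
    """Return (positional, stemmed) term collections for `text`."""
    positional = {}  # lowercased word -> list of word positions
    stemmed = set()  # 'Z'-prefixed stems, stored without positions
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        # Every word is indexed unstemmed, with its position.
        positional.setdefault(word, []).append(pos)
        # Words starting with a digit, and stopwords, get no stemmed form.
        if word[0].isdigit() or word in STOPWORDS:
            continue
        stemmed.add("Z" + toy_stem(word))
    return positional, stemmed
```

Under these assumptions, indexing "The cats of Bristol" keeps "the" and "of" as positional terms, so a phrase search containing them can still match, but produces no "Zthe" or "Zof".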