[Xapian-discuss] Stemming non-protein
James Aylett
james-xapian at tartarus.org
Thu Mar 30 11:41:17 BST 2006
On Wed, Mar 29, 2006 at 12:18:45PM -0500, Peter Masiar wrote:
> Say, my user queries for "protein". Document might say "non-protein".
> Will xapian match it? Is it possible to disable such matches?
Currently (I believe; Olly may need to correct me) a document containing
"non-protein" is indexed with "non" and "protein" as two separate terms
(well, they'll be stemmed too). Someone searching for "non-protein"
will generate a PHRASE search, "non" PHRASE(n) "protein", where n is
something appropriate (probably 2?).
So searching for "protein" will find anything containing
"non-protein", which isn't always what you want. (Probably isn't very
often what you want.)
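
To make the behaviour described above concrete, here is a toy model in
plain Python. It is not Xapian's API or its real term generator:
tokenize, matches_term and matches_phrase are hypothetical helpers,
stemming is left out, and the window of 2 is only the guess made above.

```python
import re

def tokenize(text):
    """Split on anything that is not a letter or digit.

    Returns (term, position) pairs, positions starting at 1.
    Assumption: this mimics, very roughly, splitting "non-protein" in two.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [(word, pos) for pos, word in enumerate(words, start=1)]

def matches_term(doc_text, term):
    """A single-term query matches if the term occurs anywhere."""
    return any(t == term for t, _ in tokenize(doc_text))

def matches_phrase(doc_text, phrase_terms, window):
    """True if the terms occur in order within `window` positions."""
    postings = {}
    for term, pos in tokenize(doc_text):
        postings.setdefault(term, []).append(pos)

    def extend(start, prev, remaining):
        if not remaining:
            return True
        for pos in postings.get(remaining[0], []):
            if prev < pos < start + window and extend(start, pos, remaining[1:]):
                return True
        return False

    first, rest = phrase_terms[0], phrase_terms[1:]
    return any(extend(p, p, rest) for p in postings.get(first, []))

doc = "A non-protein diet"
print(tokenize("non-protein"))                       # two terms, not one
print(matches_term(doc, "protein"))                  # plain "protein" matches
print(matches_phrase(doc, ["non", "protein"], 2))    # so does the phrase
```

The point is only that once the hyphenated word is split, a plain
"protein" query has no way of telling the two cases apart.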
To avoid this, you would probably need to generate "non-protein" as a
single term. ("protein" stemmed is still "protein" in our English
stemmer.)
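
A rough sketch of that idea, again as a toy model in plain Python and
not Xapian code: tokenize_keep_hyphens and matches_term are hypothetical
names, and stemming is ignored. In Xapian itself you would presumably
have to add such a term to the document yourself at index time, which
this sketch does not attempt.

```python
import re

def tokenize_keep_hyphens(text):
    """Keep hyphen-joined words such as "non-protein" as one term.

    Assumption: a hyphen between two alphanumeric runs joins them;
    everything else that is not a letter or digit separates terms.
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def matches_term(doc_text, term):
    """A single-term query matches only if that exact term was generated."""
    return term in tokenize_keep_hyphens(doc_text)

doc = "A non-protein diet"
print(tokenize_keep_hyphens(doc))          # "non-protein" stays whole
print(matches_term(doc, "protein"))        # no longer matches
print(matches_term(doc, "non-protein"))    # matches as a single term
```

The trade-off is that the query side has to produce the same
"non-protein" term, otherwise a search for it finds nothing.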
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org