[Xapian-discuss] Stemming non-protein

James Aylett james-xapian at tartarus.org
Thu Mar 30 11:41:17 BST 2006


On Wed, Mar 29, 2006 at 12:18:45PM -0500, Peter Masiar wrote:

> Say, my user queries for "protein". Document might say "non-protein". 
> Will xapian match it? Is it possible to disable such matches?

Currently (I believe; Olly may need to correct me) the indexer will
generate both "non" and "protein" as separate terms (stemmed as well),
while someone searching for "non-protein" will generate a phrase
search, "non" PHRASE(n) "protein", where n is something appropriate
(probably 2?).

So searching for "protein" will find anything containing
"non-protein", which isn't always what you want. (Probably isn't very
often what you want.)
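
To make the behaviour described above concrete, here is a toy sketch in
Python. It is not Xapian's code or API: the function names, the sample
documents and the window rule are all assumptions made for
illustration, and stemming is left out. It only models "split on the
hyphen, index both parts, and treat a hyphenated query as a phrase".

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split on anything that is not a letter or digit (hypothetical rule),
    so "non-protein" becomes ["non", "protein"]."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs):
    """Build a positional index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search_term(index, term):
    """Return the ids of documents containing a single term."""
    return set(index.get(term.lower(), {}))


def search_phrase(index, terms, window):
    """Return ids of documents where the terms occur in order and the
    span from first to last term is smaller than `window` positions."""
    terms = [t.lower() for t in terms]
    candidates = set.intersection(*(set(index.get(t, {})) for t in terms))
    hits = set()
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            pos, ok = start, True
            for t in terms[1:]:
                later = [p for p in index[t][doc_id] if p > pos]
                if not later:
                    ok = False
                    break
                pos = min(later)
            if ok and pos - start < window:
                hits.add(doc_id)
                break
    return hits


# Made-up sample documents, purely for illustration.
docs = {
    1: "a non-protein diet",
    2: "high protein diet",
    3: "non-dairy protein shake",
}
index = build_index(docs)

# A plain "protein" query also matches the "non-protein" document.
print(search_term(index, "protein"))                 # {1, 2, 3}
# The hyphenated query, treated as a phrase with window 2, is stricter.
print(search_phrase(index, ["non", "protein"], 2))   # {1}
```

In this model the first query shows the unwanted match: document 1 only
mentions "non-protein", yet a search for "protein" returns it.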

To avoid this, you would probably need to generate "non-protein" as a
single term when indexing. ("protein" stemmed is still "protein" in
our English stemmer.)
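
A rough sketch of that idea, again as a toy model in Python rather than
anything Xapian provides: the tokenizer below (a hypothetical helper)
keeps hyphen-joined words together as one term, so a search for
"protein" no longer matches a document that only says "non-protein".
Stemming is omitted, and the sample documents are made up.

```python
import re


def tokenize_keep_hyphens(text):
    """Keep hyphen-joined words as one term (assumed rule):
    "non-protein" stays "non-protein" instead of splitting in two."""
    pattern = r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*"
    return [t.lower() for t in re.findall(pattern, text)]


def matching_docs(docs, query_term):
    """Return ids of documents whose term list contains the query term."""
    q = query_term.lower()
    return {
        doc_id
        for doc_id, text in docs.items()
        if q in tokenize_keep_hyphens(text)
    }


# Made-up sample documents, purely for illustration.
docs = {
    1: "a non-protein diet",
    2: "high protein diet",
}

print(matching_docs(docs, "protein"))      # {2}
print(matching_docs(docs, "non-protein"))  # {1}
```

The trade-off in this model is that the query side has to tokenize the
same way, otherwise a search for "non-protein" would never line up with
the single indexed term.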

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


