[Xapian-discuss] Stemming and Quoted Phrases

Olly Betts olly at survex.com
Thu Oct 18 00:47:23 BST 2007


On Wed, Oct 17, 2007 at 04:35:03PM -0400, Mike Boone wrote:
> One thing I found from our conversion of 0.8.5 to 1.0.3 is that the
> quoted phrases are not stemmed (nor is the Z prefix applied) by the
> QueryParser in 1.0.3. Is there a way to get the QueryParser to stem
> the words in the quoted phrase but keep the positional requirements?

No. To keep the database size down, TermGenerator only stores
positional information for unstemmed terms, so as things stand, if you
stemmed the terms in a phrase, you couldn't keep the positional requirement.

The details are here:

http://www.xapian.org/docs/termgenerator.html
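
As a rough illustration of that scheme, here is a toy model in plain
Python. It does not use the Xapian API. The names toy_stem, index_text and
phrase_match are hypothetical, and toy_stem is a crude suffix stripper
standing in for a real stemmer, so the stems it produces are not the ones
Xapian would generate. The only point is that unstemmed terms carry
positions while the Z-prefixed stemmed terms do not, so a phrase check has
nothing to work with on the stemmed side.

```python
from collections import defaultdict


def toy_stem(word):
    """Crude stand-in for a real stemmer (an assumption for this sketch)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def index_text(text):
    """Index text in the style described above.

    Returns two dicts:
      positions: unstemmed term -> list of word positions (1-based)
      stemmed:   'Z' + stemmed term -> occurrence count, with no positions
    """
    positions = defaultdict(list)
    stemmed = defaultdict(int)
    for pos, word in enumerate(text.lower().split(), start=1):
        positions[word].append(pos)           # unstemmed: positions kept
        stemmed["Z" + toy_stem(word)] += 1    # stemmed: frequency only
    return dict(positions), dict(stemmed)


def phrase_match(positions, phrase):
    """True if the phrase words occur at consecutive positions."""
    words = phrase.lower().split()
    if not words:
        return False
    return any(
        all(start + i in positions.get(w, []) for i, w in enumerate(words))
        for start in positions.get(words[0], [])
    )


positions, stemmed = index_text("Running shoes for trail runners")
```

In this model phrase_match(positions, "running shoes") succeeds because
both unstemmed terms have adjacent positions. The stemmed dict knows that
"Zshoe" occurs in the document but not where, so there is no way to apply
the same adjacency test to stemmed terms.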

TermGenerator ought to be more configurable (the main reason it isn't
is that it took long enough to get 1.0.0 out the door as it was without
adding more things to the "to do" list), so this could perhaps be an
option in the future.

But I always felt it was wrong that quoted phrases were subject to
stemming before.  Do you have some examples where it makes more sense to
stem them?

Cheers,
    Olly


