[Xapian-discuss] Stemming and Quoted Phrases
Olly Betts
olly at survex.com
Thu Oct 18 00:47:23 BST 2007
On Wed, Oct 17, 2007 at 04:35:03PM -0400, Mike Boone wrote:
> One thing I found from our conversion of 0.8.5 to 1.0.3 is that the
> quoted phrases are not stemmed (nor is the Z prefix applied) by the
> QueryParser in 1.0.3. Is there a way to get the QueryParser to stem
> the words in the quoted phrase but keep the positional requirements?
No. To keep the database size down, TermGenerator only stores
positional information for unstemmed terms, so as things stand, if you
stemmed the terms in a phrase, there would be no positions to check them
against and you couldn't keep the positional requirement.
The details are here:
http://www.xapian.org/docs/termgenerator.html
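To make the trade-off concrete, here is a rough pure-Python sketch of that indexing scheme. It is not Xapian code: index_text, phrase_match and toy_stem are hypothetical names made up for illustration, and toy_stem is a trivial stand-in for a real stemmer. The only thing it tries to mirror is that unstemmed terms carry positions while Z-prefixed stemmed terms do not.

```python
from collections import defaultdict


def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a plural "s" only.
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def index_text(text):
    """Return {term: [positions]} for one document.

    Unstemmed terms are stored with their positions. Stemmed terms are
    stored with a "Z" prefix and an empty position list.
    """
    postings = defaultdict(list)
    for pos, word in enumerate(text.lower().split(), start=1):
        postings[word].append(pos)       # unstemmed term, with position
        postings["Z" + toy_stem(word)]   # stemmed term: entry created, no positions
    return dict(postings)


def phrase_match(postings, terms):
    """True if the terms occur at consecutive positions in the document."""
    first = postings.get(terms[0], [])
    return any(
        all(start + i in postings.get(term, []) for i, term in enumerate(terms))
        for start in first
    )


doc = index_text("trail running shoes")

# The stemmed term is in the index, so a plain (non-phrase) search for
# "shoe" can still find this document.
found_by_stem = "Zshoe" in doc

# The phrase check works on unstemmed terms because they have positions.
unstemmed_phrase = phrase_match(doc, ["running", "shoes"])

# The same check on stemmed terms has nothing to compare against.
stemmed_phrase = phrase_match(doc, ["Zrunning", "Zshoe"])
```

In this sketch the stemmed phrase check fails not because the words are missing, but because the Z-prefixed entries have no positions. Storing positions for them as well would fix that, at the cost of roughly doubling the positional data.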
TermGenerator ought to be more configurable (the main reason it isn't
is that it took long enough to get 1.0.0 out the door as it was without
adding more things to the "to do" list), so this could perhaps be an
option in the future.
But I always felt it was wrong that quoted phrases were subject to
stemming in earlier releases. Do you have some examples where it makes
more sense to stem them?
Cheers,
Olly