[Xapian-discuss] about stemming

Olly Betts olly at survex.com
Mon Apr 3 23:43:14 BST 2006


On Mon, Apr 03, 2006 at 08:08:21AM +0700, Perdana Panduwana wrote:
> - Is it possible that when I search for "footballer", footballer gets more
> weight than footballs, and when I search for "footballs", footballs gets
> more weight than footballer? It seems impossible because both words will
> be stemmed to "footbal", so is there any setting to make this possible?

As you say, that's clearly not possible if the index only contains terms
stemmed at index time: both words map to the same term, so the index
can't tell them apart.

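If you also index the unstemmed form of every term, you could transform
each term T in the query into (T OR stem(T)).  A document containing the
exact form then matches both branches of the OR, so it should get more
weight than one which only matches the stem.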
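
Here is a rough sketch of that idea in plain Python, to show the effect
rather than how to do it with Xapian.  Everything in it is an assumption
made for illustration: the lookup table standing in for a real stemmer,
the "Z" prefix used to keep stemmed terms apart from unstemmed ones, and
the scoring of one point per matching branch in place of real term
weights.

```python
# Stand-in for a real stemmer (hypothetical): maps the two words from the
# question to their shared stem and leaves everything else unchanged.
STEMS = {"footballer": "footbal", "footballs": "footbal"}


def stem(term):
    return STEMS.get(term, term)


def index_terms(words):
    """Index every word twice: unstemmed, and stemmed with a 'Z' prefix."""
    terms = set()
    for word in words:
        word = word.lower()
        terms.add(word)
        terms.add("Z" + stem(word))
    return terms


def expand_query(words):
    """Turn each query term T into the pair (T, stem(T)), read as T OR stem(T)."""
    return [(w.lower(), "Z" + stem(w.lower())) for w in words]


def score(doc_terms, query):
    """Toy OR scoring: each matching branch contributes one point."""
    return sum((raw in doc_terms) + (stemmed in doc_terms)
               for raw, stemmed in query)


def rank(docs, words):
    """Return document ids ordered from highest to lowest toy score."""
    query = expand_query(words)
    return sorted(docs, key=lambda d: score(docs[d], query), reverse=True)


docs = {
    "a": index_terms("the footballer scored".split()),
    "b": index_terms("two footballs were lost".split()),
}
```

With these two documents, rank(docs, ["footballer"]) puts "a" first and
rank(docs, ["footballs"]) puts "b" first, while each query still matches
the other document through the shared stem.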

I'm not convinced it'll improve retrieval results though.  I'd suggest
trying it with a quick prototype before investing a lot of time and
energy into it.

Cheers,
    Olly


