[Xapian-discuss] query time stemming and term weights

Jean-Francois Dockes jean-francois.dockes at wanadoo.fr
Thu Nov 17 17:43:23 GMT 2005


Olly Betts writes:
 > [...]
 > So that perhaps suggests that it's better to use the frequency of the
 > stem as you suggest.  Or at least that you are right to consider the
 > issue!
 > [...]
 > So my suggestion would be to do some tests and see if retrieval
 > effectiveness is actually made better/worse or left unchanged by
 > stemming at search vs index time.  I'd definitely be interested to hear
 > the results of any such tests.
 > [...]

OK, so I understand that the issue is not obvious. I'll keep watching for
weird behaviour. As for a more systematic approach, I'd actually be at a
loss as to how to test and what to look for.

 > Longer term, I'm interested in the idea of stemming at search time
 > (at least as an option).  It has several benefits such as allowing an
 > exact word search without having to index "raw" terms too, and allowing
 > choice of stemming language at search time.

Which is why I'm pursuing this approach. It would seem that keeping more
information at index time would allow more flexibility at search time (as
long as performance or volume does not become a problem).

J.F.
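
For readers following the thread, here is a minimal sketch of what stemming
at search time could look like, and why the term-weight question comes up.
It is not Xapian code: the toy_stem function, the in-memory index and all
function names are hypothetical, chosen only for illustration. The index
stores raw words only; a query word is expanded to every indexed word that
shares its stem, and the document frequency of the whole stem group can
differ a lot from that of the single raw word.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few English suffixes."""
    for suffix in ("ing", "es", "s", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Index raw (unstemmed) words only: word -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def expand(index, word):
    """Return the indexed raw words that share the query word's stem."""
    stem = toy_stem(word)
    return sorted(term for term in index if toy_stem(term) == stem)


def search(index, word, exact=False):
    """Exact search uses the raw word; otherwise expand by stem at query time."""
    terms = [word] if exact else expand(index, word)
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


def stem_doc_frequency(index, word):
    """Document frequency of the whole stem group (union of the postings)."""
    return len(search(index, word))


docs = {
    1: "the cat walks",
    2: "she walked home",
    3: "walking is healthy",
    4: "walk the dog",
}
index = build_index(docs)

expanded_terms = expand(index, "walking")
stemmed_hits = search(index, "walking")
exact_hits = search(index, "walking", exact=True)
raw_df = len(index.get("walking", set()))
stem_df = stem_doc_frequency(index, "walking")
```

In this toy collection the raw word "walking" occurs in one document while
its stem group covers all four, so a weight computed from the raw word's
frequency and one computed from the stem's frequency would differ. Which of
the two gives better retrieval is exactly the open question in this thread.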


