[Xapian-tickets] [Xapian] #812: Stemming of proper nouns

Xapian nobody at xapian.org
Fri Sep 3 06:04:26 BST 2021

#812: Stemming of proper nouns
        Reporter:  Olly Betts   |      Owner:  Olly Betts
            Type:  defect       |     Status:  new
        Priority:  normal       |  Milestone:  1.5.0
       Component:  QueryParser  |    Version:  git master
        Severity:  normal       |   Keywords:
      Blocked By:               |   Blocking:
Operating System:  All          |
 Currently the QueryParser suppresses stemming for words with an initial
 capital, with the assumption that these are proper nouns where stemming
 can be unhelpful.
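
 As a rough illustration of the heuristic described above (a sketch only,
 not Xapian's actual code), the decision can be modelled like this.  The
 names `toy_stem` and `term_for` are hypothetical, the "stemmer" just strips
 a trailing "s", and lowercasing the unstemmed word is a simplification made
 for the sketch.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a trailing 's'."""
    lower = word.lower()
    if len(lower) > 3 and lower.endswith("s"):
        return lower[:-1]
    return lower


def term_for(word):
    """Map a query word to a term using the initial-capital heuristic."""
    if word[:1].isupper():
        # Assumed to be a proper noun: leave it unstemmed.
        return word.lower()
    return toy_stem(word)


print(term_for("rivers"))  # stemmed: river
print(term_for("Rivers"))  # treated as a proper noun: rivers
```

 With this rule "rivers" and "Rivers" end up as different terms, which is
 the intended effect for English names but not for languages where ordinary
 nouns are capitalised.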

 However, that's a bit English-centric - e.g. nouns in German always have
 an initial capital, and names are inflected in some languages (Russian and
 Czech are two I'm aware of but there are likely others).

 We could scrap this special handling completely, but it seems useful for
 some languages.  Perhaps each stemmer should be able to report whether
 it's desirable to do this or not?
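
 One possible shape for that idea, sketched under assumptions: each stemmer
 carries a flag saying whether capitalised words should be left unstemmed,
 and the parser consults it.  `ToyStemmer`, `skip_capitalised`,
 `strip_suffix` and the suffix rules are all hypothetical and are not part
 of Xapian's API; the flag values chosen for the two languages are only
 illustrative.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToyStemmer:
    """Hypothetical stemmer that also reports its capitalisation policy."""
    stem: Callable[[str], str]
    skip_capitalised: bool  # True: leave words with an initial capital unstemmed


def strip_suffix(suffix):
    """Build a toy stemming function that removes one fixed suffix."""
    def stem(word):
        if len(word) > len(suffix) + 2 and word.endswith(suffix):
            return word[:-len(suffix)]
        return word
    return stem


# Illustrative settings only: real stemmers are far more involved.
ENGLISH = ToyStemmer(stem=strip_suffix("s"), skip_capitalised=True)
GERMAN = ToyStemmer(stem=strip_suffix("en"), skip_capitalised=False)


def term_for(word, stemmer):
    """Map a query word to a term, asking the stemmer about capitals."""
    if stemmer.skip_capitalised and word[:1].isupper():
        return word.lower()
    return stemmer.stem(word.lower())


print(term_for("Rivers", ENGLISH))  # left unstemmed: rivers
print(term_for("Frauen", GERMAN))   # stemmed despite the capital: frau
```

 Putting the decision on the stemmer keeps the language-specific knowledge
 in one place, so the parser does not need its own table of languages.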

 This issue is present in 1.4.x but so far nobody has actually complained
 about it.  Therefore I think we should decide how best to address it
 without the restriction of backportability, and only look at whether the
 fix can be backported once it is in place.
Ticket URL: <https://trac.xapian.org/ticket/812>
Xapian <https://xapian.org/>
