[Xapian-tickets] [Xapian] #812: Stemming of proper nouns
Xapian
nobody at xapian.org
Fri Sep 3 06:04:26 BST 2021
#812: Stemming of proper nouns
--------------------------------+------------------------
Reporter: Olly Betts            | Owner: Olly Betts
Type: defect                    | Status: new
Priority: normal                | Milestone: 1.5.0
Component: QueryParser          | Version: git master
Severity: normal                | Keywords:
Blocked By:                     | Blocking:
Operating System: All           |
--------------------------------+------------------------
Currently the QueryParser suppresses stemming for words with an initial
capital, on the assumption that these are proper nouns, for which stemming
can be unhelpful.
However, that's a bit English-centric - e.g. nouns in German always have
an initial capital, and names are inflected in some languages (Russian and
Czech are two I'm aware of but there are likely others).
We could scrap this special handling completely, but it seems useful for
some languages. Perhaps each stemmer should be able to report whether
it's desirable to do this or not?
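As a rough illustration of the per-stemmer idea, here is a minimal,
self-contained Python sketch. It does not use Xapian's API: ToyStemmer, its
skip_capitalised flag, and term_for() are all hypothetical names, and the
suffix-stripping "stemmers" are deliberately crude stand-ins for real ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyStemmer:
    """Hypothetical stand-in for a stemmer; this is not Xapian's API."""
    language: str
    suffix: str             # the only suffix this toy stemmer strips
    skip_capitalised: bool  # hypothetical flag: leave capitalised words unstemmed?

    def stem(self, word: str) -> str:
        # Crude rule for illustration only: strip one known suffix,
        # but keep at least three characters of the word.
        if word.endswith(self.suffix) and len(word) > len(self.suffix) + 2:
            return word[:-len(self.suffix)]
        return word


def term_for(word: str, stemmer: ToyStemmer) -> str:
    """Return the term to search for, asking the stemmer about capitals."""
    if stemmer.skip_capitalised and word[:1].isupper():
        # Treated as a probable proper noun: lower-case it but do not stem.
        return word.lower()
    return stemmer.stem(word.lower())


# Assumed settings for the sketch: English keeps the capital-letter rule,
# German (where every noun is capitalised) turns it off.
ENGLISH = ToyStemmer("english", suffix="s", skip_capitalised=True)
GERMAN = ToyStemmer("german", suffix="en", skip_capitalised=False)

if __name__ == "__main__":
    print(term_for("Banks", ENGLISH))   # banks (left unstemmed)
    print(term_for("banks", ENGLISH))   # bank
    print(term_for("Frauen", GERMAN))   # frau (stemmed despite the capital)
```

The point of the sketch is only that the decision moves from the query
parsing code into a property of the stemmer; what the right answer is for
each language is a separate question.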
This issue is present in 1.4.x, but so far nobody has actually complained
about it. I therefore think we should first decide how best to address it
without the restriction of backportability, and then look at backporting
once we have addressed it.
--
Ticket URL: <https://trac.xapian.org/ticket/812>
Xapian <https://xapian.org/>