[Xapian-tickets] [Xapian] #594: Add support for SCWS Chinese segmentation library

Xapian nobody at xapian.org
Mon Feb 25 09:59:14 GMT 2019


#594: Add support for SCWS Chinese segmentation library
-------------------------+-------------------------------
 Reporter:  olly         |             Owner:  olly
     Type:  enhancement  |            Status:  closed
 Priority:  normal       |         Milestone:  1.4.x
Component:  Library API  |           Version:
 Severity:  normal       |        Resolution:  incomplete
 Keywords:               |        Blocked By:
 Blocking:               |  Operating System:  All
-------------------------+-------------------------------
Changes (by olly):

 * status:  new => closed
 * resolution:   => incomplete


Comment:

 We now have support for CJK segmentation using ICU (merged in
 [f881f0bd1609]).

 The patches here are all sadly very outdated (mostly my fault).  But I
 think at this point closing this ticket makes the most sense.

 If SCWS does a better job for Chinese than ICU we could potentially
 support both (and indeed other segmentation algorithms).  I think for
 maintainability each additional alternative would need to be wrapped up
 cleanly in an iterator class in the same way that `CJKNgramIterator` and
 `CJKWordIterator` wrap the ngram and ICU algorithms.
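
 As a rough illustration of that wrapping idea, here is a minimal C++ sketch.
 Everything in it is hypothetical: `toy_segment()` is a stand-in that just
 splits on ASCII spaces where a real backend would call into a library such
 as SCWS, and `ToySegmentIterator` and `collect_terms()` are made-up names.
 It is not Xapian's actual interface, only the general shape of hiding a
 segmentation algorithm behind a small iterator class so that calling code
 does not care which algorithm is in use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an external segmentation library: it only
// splits on ASCII spaces.  A real backend would call the library here.
static std::vector<std::string> toy_segment(const std::string& text) {
    std::vector<std::string> words;
    std::string::size_type i = 0;
    while (i < text.size()) {
        std::string::size_type j = text.find(' ', i);
        if (j == std::string::npos) j = text.size();
        if (j > i) words.push_back(text.substr(i, j - i));  // skip empty runs
        i = j + 1;
    }
    return words;
}

// Hypothetical iterator class wrapping the segmenter above.  Each
// alternative algorithm would get its own class with the same shape.
class ToySegmentIterator {
    std::vector<std::string> words;
    std::size_t pos;

    bool at_end() const { return pos >= words.size(); }

  public:
    // Segment `text` and position the iterator on the first segment.
    explicit ToySegmentIterator(const std::string& text)
        : words(toy_segment(text)), pos(0) {}

    // Default construction gives the end-of-input iterator.
    ToySegmentIterator() : pos(0) {}

    const std::string& operator*() const {
        assert(!at_end());
        return words[pos];
    }

    ToySegmentIterator& operator++() {
        assert(!at_end());
        ++pos;
        return *this;
    }

    // Comparison is only meaningful against the end iterator: two
    // iterators compare equal exactly when both are exhausted.
    bool operator==(const ToySegmentIterator& o) const {
        return at_end() && o.at_end();
    }
    bool operator!=(const ToySegmentIterator& o) const {
        return !(*this == o);
    }
};

// Calling code written against the iterator shape only, so a different
// segmentation algorithm can be swapped in by changing the template argument.
template <typename Iterator>
std::vector<std::string> collect_terms(Iterator it) {
    std::vector<std::string> terms;
    const Iterator end = Iterator();
    for (; it != end; ++it) terms.push_back(*it);
    return terms;
}
```

 With this shape, `collect_terms(ToySegmentIterator("some text"))` walks the
 segments without knowing how they were produced, which is the property that
 would keep additional algorithms maintainable.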

--
Ticket URL: <https://trac.xapian.org/ticket/594#comment:4>
Xapian <https://xapian.org/>


