[Xapian-tickets] [Xapian] #180: Add support for CJK text to queryparser and termgenerator

Xapian nobody at xapian.org
Thu Aug 18 03:01:07 BST 2011


#180: Add support for CJK text to queryparser and termgenerator
-------------------------+--------------------------------------------------
 Reporter:  richard      |        Owner:  richard  
     Type:  enhancement  |       Status:  assigned 
 Priority:  normal       |    Milestone:  1.3.0    
Component:  QueryParser  |      Version:  SVN trunk
 Severity:  normal       |   Resolution:           
 Keywords:               |    Blockedby:           
 Platform:  All          |     Blocking:           
-------------------------+--------------------------------------------------

Comment(by olly):

 I've committed the latest patch on a branch in git, cleaned up a few
 things, and fixed a bug where an iterator was dereferenced before it was
 checked against the end:

 https://github.com/ojwb/xapian-chinese-segmentation/tree/cjk-from-ticket-180
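
 To illustrate the class of bug mentioned above, here is a minimal sketch.
 It is not the code from the branch: `count_leading` is a hypothetical
 helper invented for this example. The point is only that the end check
 has to come first, so that `&&` short-circuits before the dereference.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the Xapian patch): count how many leading
// elements of `v` are equal to `value`.
std::size_t count_leading(const std::vector<int>& v, int value) {
    auto it = v.begin();
    std::size_t n = 0;
    // Buggy ordering would be: while (*it == value && it != v.end())
    // which dereferences `it` even when it already equals v.end().
    // Correct ordering: test against end() first; && short-circuits, so
    // *it is only evaluated when `it` points at a real element.
    while (it != v.end() && *it == value) {
        ++it;
        ++n;
    }
    return n;
}
```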

 The plan is to merge in the Chinese segmentation support as well, since
 it needs to hook in at very similar places.

 I noticed an issue with the term positions: currently the code blindly
 assigns a different position to every n-gram it generates, which doesn't
 seem like a good approach.
 I'm not sure what the best approach is, though.  The key thing is that we
 want phrases and the NEAR and ADJ operators to work in a natural way for
 users.
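
 A rough sketch of the position issue, assuming the n-grams are unigrams
 and bigrams over a run of CJK characters. This is standalone C++ rather
 than Xapian code; the function names and the `Posting` type are
 hypothetical. The first function mimics giving every n-gram a fresh
 position. The second shows one possible alternative (an assumption for
 illustration, not a decision from this ticket), where each n-gram takes
 the position of its first character.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// (term, position) pair; positions are 1-based purely for illustration.
using Posting = std::pair<std::u32string, unsigned>;

// Every generated n-gram gets its own, strictly increasing position.
std::vector<Posting> ngrams_distinct_positions(const std::u32string& text) {
    std::vector<Posting> out;
    unsigned pos = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        out.emplace_back(text.substr(i, 1), ++pos);      // unigram
        if (i + 1 < text.size())
            out.emplace_back(text.substr(i, 2), ++pos);  // bigram
    }
    return out;
}

// Hypothetical alternative: an n-gram is positioned at its first character,
// so a unigram and the bigram starting at the same character share a position.
std::vector<Posting> ngrams_shared_positions(const std::u32string& text) {
    std::vector<Posting> out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned pos = static_cast<unsigned>(i) + 1;
        out.emplace_back(text.substr(i, 1), pos);        // unigram
        if (i + 1 < text.size())
            out.emplace_back(text.substr(i, 2), pos);    // bigram
    }
    return out;
}
```

 For a three-character input, the first scheme puts the first two unigrams
 at positions 1 and 3, so they are no longer adjacent as far as position
 arithmetic is concerned. The second keeps them at 1 and 2. Whether that is
 the right behaviour for phrases, NEAR and ADJ is exactly the open question.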

-- 
Ticket URL: <http://trac.xapian.org/ticket/180#comment:24>
Xapian <http://xapian.org/>