[Xapian-tickets] [Xapian] #180: Add support for CJK text to queryparser and termgenerator

Xapian nobody at xapian.org
Sun Aug 21 15:08:21 BST 2011


#180: Add support for CJK text to queryparser and termgenerator
-------------------------+--------------------------------------------------
 Reporter:  richard      |        Owner:  richard  
     Type:  enhancement  |       Status:  assigned 
 Priority:  normal       |    Milestone:  1.3.0    
Component:  QueryParser  |      Version:  SVN trunk
 Severity:  normal       |   Resolution:           
 Keywords:               |    Blockedby:           
 Platform:  All          |     Blocking:           
-------------------------+--------------------------------------------------

Comment(by olly):

 OK, now we only add positional information for the single-character CJK
 terms (which will save quite a bit of space).

 And quoted phrases containing CJK now work - currently they just add each
 CJK character to the phrase as a separate term.  We could also add the CJK
 bigrams as filters for the phrase, which should significantly cut down the
 number of cases we need to check positional data for and so will usually
 be faster, but that's just an efficiency tweak so I've left that for now.
 These changes are pushed to the git branch.
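
 To illustrate the idea described above, here is a minimal sketch in Python.
 It is not Xapian's code: the function names (cjk_terms, build_index,
 phrase_search) and the toy in-memory index are hypothetical, and it assumes
 the input is a pure CJK string.  Single characters are indexed with
 positions, bigrams without, and the optional bigram filter shrinks the set
 of documents whose positional data has to be checked.

```python
from collections import defaultdict


def cjk_terms(text):
    """Yield (term, position) pairs for a pure-CJK string (assumption).

    Single characters carry a 1-based position; bigrams carry None,
    mirroring "positions only for single character terms".
    """
    for i, ch in enumerate(text):
        yield ch, i + 1
        if i + 1 < len(text):
            yield text[i:i + 2], None


def build_index(docs):
    """Build a toy index: term -> doc ids, and term -> doc id -> positions."""
    postings = defaultdict(set)
    positions = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for term, pos in cjk_terms(text):
            postings[term].add(doc_id)
            if pos is not None:
                positions[term][doc_id].append(pos)
    return postings, positions


def phrase_search(phrase, postings, positions, use_bigram_filter=True):
    """Return (matching doc ids, number of docs whose positions were checked).

    The phrase must be non-empty. Each character is a separate phrase term;
    bigrams, if enabled, only narrow the candidate set beforehand.
    """
    chars = list(phrase)
    candidates = set.intersection(*(postings.get(c, set()) for c in chars))
    if use_bigram_filter:
        for i in range(len(phrase) - 1):
            candidates &= postings.get(phrase[i:i + 2], set())

    hits, checked = [], 0
    for doc_id in sorted(candidates):
        checked += 1
        # Positions of each phrase character in this document.
        per_char = [set(positions.get(c, {}).get(doc_id, [])) for c in chars]
        # The phrase matches if some start position is followed by the
        # remaining characters at consecutive positions.
        if any(all(start + k in per_char[k] for k in range(len(chars)))
               for start in per_char[0]):
            hits.append(doc_id)
    return hits, checked
```

 With three toy documents "中文搜索", "文中索搜" and "搜索中文", a search for
 the phrase "中文" finds the same two documents either way, but the bigram
 filter means positional data is consulted for two documents rather than
 three.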

-- 
Ticket URL: <http://trac.xapian.org/ticket/180#comment:27>
Xapian <http://xapian.org/>


