[Xapian-tickets] [Xapian] #180: Add support for CJK text to queryparser and termgenerator

Xapian nobody at xapian.org
Fri Sep 25 11:37:53 BST 2009


#180: Add support for CJK text to queryparser and termgenerator
-------------------------+--------------------------------------------------
 Reporter:  richard      |        Owner:  richard  
     Type:  enhancement  |       Status:  assigned 
 Priority:  high         |    Milestone:  1.2.0    
Component:  QueryParser  |      Version:  SVN trunk
 Severity:  normal       |   Resolution:           
 Keywords:               |    Blockedby:           
 Platform:  All          |     Blocking:           
-------------------------+--------------------------------------------------

Comment(by xaka):

 Updated patch attached.

 1. Where should I put the CJKV header and source files?

 2. Yes, the glib2 dependency is not good, because Xapian already has its
 own Unicode/UTF-8 API. I agree, but I have no time at the moment to
 completely rework the CJKV code, which is why I've integrated Dijon's code
 "as is". One point: the Dijon/glib2 code will be used only if the document
 contains CJKV sequences, so it is 99% backward compatible for non-CJKV
 documents :).

 3. How and where should the user select CJKV mode? What if the user just
 has a big folder with many files which is updated every day, and every day
 this big folder is reindexed? Another example is international forums.
 There is no way to say "index this file/topic in CJKV mode". We can try to
 optimize the process of scanning for and detecting CJKV sequences.

 4. About your alternative: it is already done in the patch (if I
 understand you correctly). If the string being indexed doesn't contain
 CJKV, the old algorithm will be used.

 Put simply: "No CJKV - the patch will not be used and everything stays as
 is. If there is CJKV - we will use the modified queryparser/termgenerator
 code".
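
 A minimal sketch of that dispatch, in Python for brevity. It is not taken
 from the patch or from Dijon's code: the names CJKV_RANGES, is_cjkv,
 contains_cjkv and choose_tokenizer are hypothetical, and the choice of
 Unicode blocks that count as "CJKV" is an assumption. The scan stops at
 the first CJKV character, so non-CJKV text costs only one pass over the
 string.

```python
# Hypothetical sketch; the block list below is an assumption, not the
# set of ranges used by the attached patch.
CJKV_RANGES = (
    (0x3040, 0x309F),    # Hiragana
    (0x30A0, 0x30FF),    # Katakana
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0xAC00, 0xD7AF),    # Hangul Syllables
    (0xF900, 0xFAFF),    # CJK Compatibility Ideographs
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
)


def is_cjkv(ch):
    """Return True if the single character falls in one of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJKV_RANGES)


def contains_cjkv(text):
    """Return True as soon as one CJKV character is found."""
    # any() short-circuits, so the scan ends at the first match.
    return any(is_cjkv(ch) for ch in text)


def choose_tokenizer(text):
    """Pick the code path: 'cjkv' for the modified one, else 'default'."""
    return "cjkv" if contains_cjkv(text) else "default"
```

 With this, choose_tokenizer("hello world") picks the old code path, while
 a string containing even a single ideograph picks the CJKV one.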

 Let's continue discussing all of this; I think I can help to complete the
 CJKV integration. The major work is done. Minor work remains...

-- 
Ticket URL: <http://trac.xapian.org/ticket/180#comment:9>
Xapian <http://xapian.org/>