[Xapian-tickets] [Xapian] #150: Enhancements to Unicode support

Xapian nobody at xapian.org
Tue Jun 7 20:17:18 BST 2011


#150: Enhancements to Unicode support
-------------------------+--------------------------------------------------
 Reporter:  olly         |        Owner:  olly     
     Type:  enhancement  |       Status:  assigned 
 Priority:  normal       |    Milestone:  2.0.0    
Component:  QueryParser  |      Version:  SVN trunk
 Severity:  minor        |   Resolution:           
 Keywords:               |    Blockedby:           
 Platform:  All          |     Blocking:           
-------------------------+--------------------------------------------------

Comment(by djcb):

 FYI, I'm using Xapian, and I 'flatten' (normalize) strings before adding
 them as terms. My table-based implementation is here:
     http://gitorious.org/mu/mu-ng/blobs/master/src/mu-str-normalize.c
 It's sufficient for most Latin-based accented characters, and its strong
 point (for speed and memory usage) is that it can flatten the strings
 _in place_.
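
To illustrate the idea of table-based, in-place flattening, here is a minimal sketch. It is not the code from mu-str-normalize.c; the names `latin1_flat` and `flatten_in_place` are hypothetical, and it assumes NUL-terminated UTF-8 input and covers only U+00C0..U+00FF. In-place rewriting works because each two-byte sequence is replaced by a single ASCII byte, so the write position never overtakes the read position.

```c
/* Hypothetical lookup table for U+00C0..U+00FF, which UTF-8 encodes as
   0xC3 followed by 0x80..0xBF. '.' marks characters that are left
   unchanged because they have no single-letter base (e.g. AE, eth, sharp s). */
static const char latin1_flat[] =
    "AAAAAA.CEEEEIIII" ".NOOOOO..UUUUY.."
    "aaaaaa.ceeeeiiii" ".nooooo..uuuuy.y";

/* Flatten accented Latin-1 letters in a NUL-terminated UTF-8 string,
   overwriting the buffer. The result is never longer than the input. */
char *flatten_in_place(char *str)
{
    unsigned char *rd = (unsigned char *)str;  /* read position  */
    unsigned char *wr = rd;                    /* write position */

    while (*rd) {
        if (rd[0] == 0xC3 && rd[1] >= 0x80 && rd[1] <= 0xBF
            && latin1_flat[rd[1] - 0x80] != '.') {
            /* Two input bytes collapse into one ASCII byte. */
            *wr++ = (unsigned char)latin1_flat[rd[1] - 0x80];
            rd += 2;
        } else {
            /* Everything else is copied through byte by byte. */
            *wr++ = *rd++;
        }
    }
    *wr = '\0';
    return str;
}
```

Called on a writable buffer holding "café", this leaves "cafe" in the same buffer. Characters outside the table (for example anything beyond U+00FF) pass through untouched, which is the coverage limit of a small table like this one.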

 For a more complete (and shorter) version, some equivalent of
 g_utf8_normalize could be used: first the base characters and their
 accents are separated (decomposed), and after that the accent characters
 are removed.
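
The second step of that approach, removing the accent characters, can be sketched as follows. This is only an illustration under stated assumptions: the input is assumed to be already decomposed (for example by GLib's g_utf8_normalize with G_NORMALIZE_NFD), the function names are hypothetical, and only the Combining Diacritical Marks block U+0300..U+036F is stripped. A library-based version would test the Unicode character class instead of a fixed range.

```c
/* Returns nonzero if p points at a UTF-8 encoded combining mark in
   U+0300..U+036F, i.e. the byte pairs 0xCC 0x80 .. 0xCD 0xAF. */
static int is_combining_mark(const unsigned char *p)
{
    if (p[0] == 0xCC) return p[1] >= 0x80 && p[1] <= 0xBF;
    if (p[0] == 0xCD) return p[1] >= 0x80 && p[1] <= 0xAF;
    return 0;
}

/* Remove combining marks from an already-decomposed, NUL-terminated
   UTF-8 string, overwriting the buffer. Base characters are kept. */
char *strip_combining_marks(char *nfd)
{
    unsigned char *rd = (unsigned char *)nfd;  /* read position  */
    unsigned char *wr = rd;                    /* write position */

    while (*rd) {
        if (is_combining_mark(rd))
            rd += 2;            /* skip the two-byte mark */
        else
            *wr++ = *rd++;      /* keep everything else   */
    }
    *wr = '\0';
    return nfd;
}
```

For a decomposed "café" (the letter e followed by U+0301), the buffer ends up holding "cafe". Because decomposition handles the mapping from precomposed letters to base plus mark, no per-character table is needed, which is why this variant can be both shorter and more complete.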

-- 
Ticket URL: <http://trac.xapian.org/ticket/150#comment:11>
Xapian <http://xapian.org/>