[Xapian-tickets] [Xapian] #150: Enhancements to Unicode support
Xapian
nobody at xapian.org
Tue Jun 7 20:17:18 BST 2011
#150: Enhancements to Unicode support
-------------------------+--------------------------------------------------
Reporter: olly | Owner: olly
Type: enhancement | Status: assigned
Priority: normal | Milestone: 2.0.0
Component: QueryParser | Version: SVN trunk
Severity: minor | Resolution:
Keywords: | Blockedby:
Platform: All | Blocking:
-------------------------+--------------------------------------------------
Comment(by djcb):
FYI, I'm using Xapian, and I 'flatten' (normalize) strings before adding
them as terms. My table-based implementation:
http://gitorious.org/mu/mu-ng/blobs/master/src/mu-str-normalize.c
It's sufficient for most Latin-based accented characters, and its strong
point (for speed and memory usage) is that it can flatten the strings
_in place_.
For a more complete (and shorter) version, some equivalent of
g_str_normalize could be used, where the accents are first separated from
their base characters, and the accent characters are then removed.
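
To illustrate the second approach (decompose first, then drop the accents),
here is a minimal sketch in Python rather than C. It is not the code from
mu-str-normalize.c, and the function name flatten is made up for this
example. It assumes that canonical decomposition (NFD) is the kind of
normalization meant above; in GLib the corresponding call is presumably
g_utf8_normalize.

```python
import unicodedata


def flatten(text: str) -> str:
    """Remove accents by decomposing, then dropping the combining marks.

    Hypothetical helper for illustration only.
    """
    # NFD splits a precomposed character such as "é" into its base
    # letter "e" followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every character with a non-zero canonical combining class,
    # which covers the accents split off in the previous step.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


print(flatten("café"))      # cafe
print(flatten("Ångström"))  # Angstrom
```

Note that this builds a new string, so it gives up the in-place property of
the table-based version. Letters without a canonical decomposition, such as
"ø" or "ß", pass through unchanged and would still need a table entry.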
--
Ticket URL: <http://trac.xapian.org/ticket/150#comment:11>
Xapian <http://xapian.org/>