[Xapian-devel] ICU

Richard Boulton richard at lemurconsulting.com
Wed Apr 12 16:00:00 BST 2006


Olly Betts wrote:
> For Omega we also need encoding conversion, which I think inevitably
> needs a large bit of code or data.  Tcl's code for this is compact, but
> has 1.3MB of data files.  I don't see so much an issue with adding a large
> library dependency to omega, be it ICU, glib, using Tcl's code, or using
> an installed version of Tcl.  Or something else.

Another potential option is Simon Tatham's "libcharset".  I mainly
mention this for completeness: I'm not sure how actively he's developing
or supporting this code, and it's unlikely to be installed already on
a user's system.  On the plus side, the author is easy to contact, I
think it more than covers the encoding conversion routines we'd need,
it's portable, and its UTF-8 conversion code is very compact.  (The full
library compiles to about 650K on my machine, but most of that is
compiled data tables; the UTF-8 code compiles to an object file of
about 2K.)
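
To illustrate why the UTF-8 part can be so small compared with the
table-driven encodings, here is a rough sketch in Python.  It is not
libcharset's code or API: the function names utf8_encode and
latin1_to_utf8 are made up for this example.  The point is only that
UTF-8 encoding is plain bit arithmetic, and that ISO-8859-1 happens to
need no table either, whereas most other legacy encodings need a mapping
table per character set.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using only bit arithmetic.

    Hypothetical helper for illustration; not part of any library.
    """
    # Reject values outside the Unicode range and the surrogate block.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def latin1_to_utf8(data: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8.

    Each ISO-8859-1 byte value is also its Unicode code point
    (U+0000..U+00FF), so no lookup table is needed for this charset.
    """
    return b"".join(utf8_encode(b) for b in data)
```

A call such as latin1_to_utf8(b"caf\xe9") should give the same bytes as
Python's own "café".encode("utf-8").  Any other 8-bit charset would
replace the identity mapping in latin1_to_utf8 with a 256-entry table,
and the multibyte East Asian encodings need far larger tables, which is
presumably where most of the 650K goes.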

There isn't a web page for it; the Subversion repository is at
svn://tartarus.org/main/charset/

and is web-viewable at 
http://www.tartarus.org/~simon-anonsvn/viewcvs.cgi/charset/

-- 
Richard


