[Xapian-devel] ICU

Olly Betts olly at survex.com
Wed Apr 12 13:56:39 BST 2006


On Wed, Apr 12, 2006 at 08:38:42AM +0200, Jean-Francois Dockes wrote:
> What's wrong with iconv for encoding conversion ?

The main problem is iconv_open.  As the Linux iconv_open man page puts it:

    The values permitted for fromcode and tocode and the supported
    combinations are system dependent.

The problem is that there's no standard accompanying API for discovering
which values are supported, or in which combinations.  So perhaps on some
platform I can't convert from encoding X to utf-8, but I could convert
from encoding X to Y and then Y to utf-8.  Or utf-8 may not be supported
at all.  I've read before that these are genuine problems with trying to
use iconv.
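To make that concrete: about the only portable way to find out is to call
iconv_open and see whether it fails.  Here's a rough sketch of that, plus
the "go via Y" fallback.  The names can_convert and find_route and the
pivot list are made up for illustration; they aren't part of iconv or
Xapian.

```c
#include <iconv.h>
#include <stddef.h>

/* Returns 1 if iconv_open accepts this pair, 0 if it rejects it. */
int can_convert(const char *to, const char *from)
{
    iconv_t cd = iconv_open(to, from);   /* note: tocode comes first */
    if (cd == (iconv_t)-1)
        return 0;
    iconv_close(cd);
    return 1;
}

/* Hypothetical helper: look for a way to get from `from` to `to`.
 * Returns 1 if a direct conversion exists, 2 if one exists via a pivot
 * encoding (stored in *via), and 0 if nothing was found. */
int find_route(const char *to, const char *from,
               const char *const *pivots, size_t n_pivots,
               const char **via)
{
    size_t i;
    *via = NULL;
    if (can_convert(to, from))
        return 1;
    for (i = 0; i < n_pivots; ++i) {
        if (can_convert(pivots[i], from) && can_convert(to, pivots[i])) {
            *via = pivots[i];
            return 2;
        }
    }
    return 0;
}
```

This only tells you that iconv_open accepted the names, not how good the
conversion is, and the caller still has to guess a sensible list of pivots.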

It's also not portably documented how to spell any particular encoding -
for GNU libiconv, it appears utf-8 is "UTF-8", but there's no assurance
that name will work on another implementation even if utf-8 is supported.
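One workaround for the spelling issue is to try several likely names in
turn.  A sketch, assuming the list of spellings below (which is just my
guess at common variants, not anything standardised) and a made-up
function name open_to_utf8:

```c
#include <iconv.h>
#include <stddef.h>

/* Candidate spellings for utf-8; a guess, not a standard list. */
static const char *const utf8_names[] = { "UTF-8", "utf-8", "UTF8", "utf8" };

/* Hypothetical helper: open a converter from `from` to utf-8, trying each
 * spelling until one is accepted.  On success the spelling that worked is
 * stored in *name_used (if non-NULL); on failure returns (iconv_t)-1. */
iconv_t open_to_utf8(const char *from, const char **name_used)
{
    size_t i;
    for (i = 0; i < sizeof utf8_names / sizeof utf8_names[0]; ++i) {
        iconv_t cd = iconv_open(utf8_names[i], from);
        if (cd != (iconv_t)-1) {
            if (name_used)
                *name_used = utf8_names[i];
            return cd;
        }
    }
    if (name_used)
        *name_used = NULL;
    return (iconv_t)-1;
}
```

Note that a failure here is ambiguous: it could mean none of the utf-8
spellings is recognised, or that `from` is misspelled or unsupported.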

The GNU implementation seems pretty decent - it supports a lot of
encodings and can convert between any given pair.  So one option is to
use iconv where it's known to be decent, but use other code elsewhere.
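Something like the following is what I have in mind, shown for the easy
case of ISO-8859-1 to utf-8.  It assumes GNU libiconv can be detected via
the _LIBICONV_VERSION macro from its iconv.h, and glibc via __GLIBC__;
TRUST_ICONV and the function names are invented for the example, and a
platform with no iconv.h at all would need a configure-time check instead.

```c
#include <iconv.h>
#include <stddef.h>

#if defined(_LIBICONV_VERSION) || defined(__GLIBC__)
# define TRUST_ICONV 1   /* GNU libiconv or glibc's iconv */
#else
# define TRUST_ICONV 0   /* unknown implementation: use our own code */
#endif

/* Fallback converter: ISO-8859-1 bytes map directly to U+0000..U+00FF.
 * Returns bytes written, or (size_t)-1 if `out` is too small. */
static size_t latin1_to_utf8_builtin(const char *in, size_t in_len,
                                     char *out, size_t out_cap)
{
    size_t i, o = 0;
    for (i = 0; i < in_len; ++i) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x80) {
            if (o + 1 > out_cap) return (size_t)-1;
            out[o++] = (char)c;
        } else {
            if (o + 2 > out_cap) return (size_t)-1;
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    return o;
}

/* Uses iconv where it's trusted and the conversion opens, otherwise the
 * built-in code.  Returns bytes written, or (size_t)-1 on error. */
size_t latin1_to_utf8(const char *in, size_t in_len,
                      char *out, size_t out_cap)
{
#if TRUST_ICONV
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd != (iconv_t)-1) {
        char *src = (char *)in;   /* iconv takes char **, but won't modify the input */
        char *dst = out;
        size_t src_left = in_len, dst_left = out_cap;
        size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
        iconv_close(cd);
        if (rc == (size_t)-1)
            return (size_t)-1;
        return out_cap - dst_left;
    }
#endif
    return latin1_to_utf8_builtin(in, in_len, out, out_cap);
}
```

The obvious cost is that every encoding you want to support on the
"elsewhere" platforms needs its own fallback code.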

Cheers,
    Olly
