[Xapian-discuss] UTF-8 Corruption
James Aylett
james-xapian at tartarus.org
Thu Mar 20 14:11:06 GMT 2008
On Thu, Mar 20, 2008 at 02:08:00PM +0000, Colin Bell wrote:
> > There are ways to detect the character set of a file, though not
> > always 100% reliably.
>
> Can anyone recommend some C++ code to do this?
I assume, but don't know, that the Firefox/Mozilla ``magic'' charset
detector is in C or C++ (the one that Mark Pilgrim ported to Python).
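
As a cheap first pass before reaching for a full detector, one common heuristic is to check whether the bytes are well-formed UTF-8 at all. The sketch below is not the Mozilla detector and makes no attempt to guess between legacy charsets; the function name looks_like_utf8 is made up for illustration. It assumes the whole file has already been read into a std::string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `data` is well-formed UTF-8
// (no stray continuation bytes, no overlong forms, no surrogates,
// nothing above U+10FFFF). Pure ASCII also returns true.
bool looks_like_utf8(const std::string& data) {
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        if (c < 0x80) {              // single-byte (ASCII)
            ++i;
            continue;
        }
        std::size_t extra;           // continuation bytes expected
        unsigned long cp;            // code point being decoded
        unsigned long min_cp;        // smallest value legal at this length
        if ((c & 0xE0) == 0xC0) {
            extra = 1; cp = c & 0x1F; min_cp = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F; min_cp = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; cp = c & 0x07; min_cp = 0x10000;
        } else {
            return false;            // stray continuation or invalid lead byte
        }
        if (n - i <= extra) {
            return false;            // sequence truncated at end of input
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(data[i + k]);
            if ((cc & 0xC0) != 0x80) {
                return false;        // not a continuation byte
            }
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        i += extra + 1;
    }
    return true;
}
```

If this returns false, the data is definitely not UTF-8 and needs converting from something else (ISO-8859-1, Windows-1252, ...) before indexing. If it returns true, the data is either ASCII or very probably UTF-8, since text in a legacy 8-bit charset with non-ASCII characters rarely happens to form valid UTF-8 sequences. Working out which legacy charset it is still needs a real detector.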
J
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org