[Xapian-discuss] Encoding Oddities

Johannes Fahrenkrug jfahrenkrug at gmail.com
Wed Jan 26 00:49:56 GMT 2011


Hi Adam,

> It sounds like the Ä isn't being lowercased as it should at indexing
> How is Ruby's support for lowercasing of utf-8 chars?

Ruby's UTF-8 "support" is a joke in 1.8.x, and that was exactly the
problem: the "downcase" method of Ruby's String class only lowercases
ASCII letters, so UTF-8 characters such as "Ä" were left unchanged at
indexing time. There are two ways to get around it: if you're using
Rails, use "a string".mb_chars.downcase. Otherwise, require the
"unicode" gem and use Unicode::downcase("a string").
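
For the archives, here is a rough sketch of how the two workarounds
could be wrapped in one place. The helper name utf8_downcase is made up
for illustration and is not part of Ruby, Rails, or Xapian. The last
fallback branch and the downcase(:ascii) comparison assume Ruby 2.4 or
later, where String#downcase is Unicode-aware; on 1.8.x you would need
one of the first two branches.

```ruby
# encoding: utf-8

begin
  require 'unicode' # the "unicode" gem; optional for this sketch
rescue LoadError
  # Gem not installed; utf8_downcase falls through to the other branches.
end

# Hypothetical helper: lowercase a UTF-8 string before handing it to the indexer.
def utf8_downcase(str)
  if defined?(Unicode) && Unicode.respond_to?(:downcase)
    Unicode.downcase(str)        # "unicode" gem
  elsif str.respond_to?(:mb_chars)
    str.mb_chars.downcase.to_s   # ActiveSupport (Rails)
  else
    str.downcase                 # assumption: Ruby 2.4+, Unicode-aware
  end
end

# Assumption: Ruby 2.4+. The :ascii option lowercases only A-Z, which
# reproduces the behaviour that caused the problem on 1.8.x.
ascii_only = "ÄRGER".downcase(:ascii)
fixed      = utf8_downcase("ÄRGER")

puts ascii_only
puts fixed
```

The point is to apply the same lowercasing to the text at indexing time
and to the query terms, so that "Ä" and "ä" end up as the same term.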

Cheers,

Johannes

>  Just a guess,
>
>    Adam
>
> --
>  "Här kommer rädslan, gamle vän                               Adam Sjøgren
>  När alla fjärilar i magen vaknar upp                   asjo at koldfront.dk
>  Viskar välkommen hem"
>
>



-- 
springenwerk.com | github.com/jfahrenkrug | twitter.com/jfahrenkrug


