[Xapian-discuss] indexing words with alternative spellings
Olly Betts
olly at survex.com
Thu May 13 03:06:44 BST 2010
On Tue, May 11, 2010 at 03:18:38PM +0200, Per Jessen wrote:
> Some languages (e.g. German and Danish) have special letters that are
> often written using two-letter combinations when the appropriate
> keyboard or medium is not available:
For German, you can use the "german2" stemmer which transliterates as
you describe.
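
To illustrate the idea only, here is a rough Python sketch of that kind of folding. It is not the "german2" stemmer itself and does no stemming: the function name fold_german is made up for this example, and the rules (ß to ss; ae, oe, ue to ä, ö, ü unless the ue follows a q; then umlauts dropped) are a simplified reading of what the stemmer does, so treat them as assumptions.

```python
import re

# Hypothetical helper, not part of Xapian or Snowball: folds both spellings
# of a German word to one form so they would index to the same term.
_UMLAUTS_TO_PLAIN = str.maketrans("äöü", "aou")

def fold_german(word: str) -> str:
    w = word.lower()
    w = w.replace("ß", "ss")                      # sharp s written as "ss"
    w = w.replace("ae", "ä").replace("oe", "ö")   # two-letter forms to umlauts
    w = re.sub(r"(?<!q)ue", "ü", w)               # "ue" too, but not after "q"
    return w.translate(_UMLAUTS_TO_PLAIN)         # finally drop the umlauts

if __name__ == "__main__":
    for pair in (("Müller", "Mueller"), ("Straße", "Strasse")):
        print(pair, [fold_german(p) for p in pair])
```

With this, "Müller" and "Mueller" fold to the same string, while the "ue" in "Quelle" is left alone. A blanket rule like this will also fold some words where "oe" or "ue" is not an umlaut spelling, which is the usual trade-off with this approach.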
There's also unac for more general accent normalisation:
http://www.nongnu.org/unac/
There's actually a version 1.8.0 which isn't mentioned there (though Debian
has it).  I'm not sure what's up, but the upstream page at
http://www.senga.org/unac/ is no longer there.
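
For a feel of what general accent normalisation means, here is a minimal sketch using only Python's standard unicodedata module. It is not unac and makes no claim to match unac's output; the function name strip_accents is just for this example. It decomposes each character and drops the combining marks.

```python
import unicodedata

def strip_accents(text: str) -> str:
    # NFKD splits e.g. "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

if __name__ == "__main__":
    for word in ("café", "Ångström", "søster"):
        print(word, "->", strip_accents(word))
```

Note that letters such as the Danish "ø" have no Unicode decomposition, so this sketch leaves them unchanged; they would need an explicit mapping of their own.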
Cheers,
Olly