[Xapian-discuss] indexing words with alternative spellings

Per Jessen per at computer.org
Tue May 11 14:18:38 BST 2010


Some languages (e.g. German and Danish) have special letters that are
often written using two-letter combinations when the appropriate
keyboard or medium is not available:

ä = ae
ü = ue
ö = oe
æ = ae
ø = oe
å = aa
ß = ss 

(There are undoubtedly many more examples than these.)

As a user of an index, I would like to be able to search for
e.g. "schaefer" and get matches for both the 'ae' and the 'ä' spelling
returned, and likewise if I searched for 'schäfer'.  Is this something
I would need to take into account when I do the indexing, or is there
another way to handle it?
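
To make the question concrete, here is a minimal sketch of one possible
approach: fold every term to its two-letter spelling both when indexing
and when searching, so that both spellings end up as the same term. It
uses a toy in-memory index in plain Python and no Xapian calls at all;
the names FOLD, fold, build_index and search are made up for
illustration, and the mapping covers only the letters listed above.

```python
# Hypothetical folding table built from the letters listed in this post.
# Only lowercase forms are needed because fold() lowercases first.
FOLD = str.maketrans({
    "ä": "ae", "ü": "ue", "ö": "oe",
    "æ": "ae", "ø": "oe", "å": "aa", "ß": "ss",
})


def fold(term):
    """Lowercase a term and expand special letters to two-letter forms."""
    return term.lower().translate(FOLD)


def build_index(docs):
    """Build a toy inverted index: folded term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(fold(word), set()).add(doc_id)
    return index


def search(index, query):
    """Fold the query the same way as the indexed terms, then look it up."""
    return index.get(fold(query), set())


docs = {1: "Herr Schäfer", 2: "Herr Schaefer", 3: "Herr Schmidt"}
index = build_index(docs)
print(search(index, "schaefer"))
print(search(index, "schäfer"))
```

With this folding, both queries should return documents 1 and 2. The
obvious cost is that a word which genuinely contains 'ae' or 'oe' can no
longer be told apart from one written with the special letter.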


/Per Jessen, Zürich



