[Xapian-discuss] queryparser thinks ø is o

Olly Betts olly at survex.com
Tue Sep 13 05:21:12 BST 2005

On Sun, Aug 28, 2005 at 02:49:23PM +0200, R. Mattes wrote:
> On Mon, 2005-08-29 at 14:18 +0200, Marcus Ramberg wrote:
> > Thanks for the tips. However, disabling the action in the normalizer
> > makes the queryparser tokenize on æøå instead of including them in
> > the term. Where can I modify the tokenizer in queryparser to include
> > high-ASCII chars (or at least the ones I need)?

You'd need to tweak the queryparser's tokenizer so that it treats accented
letters as part of a word.
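
As a rough illustration of the idea (this is not the actual queryparser
source; the names is_word_char and split_terms are made up for the example,
and it assumes the text is ISO-8859-1, one byte per character), a word
character test that accepts accented letters might look something like this:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, NOT Xapian's real code.
// Assumes ISO-8859-1 input: bytes 0xC0-0xFF are accented letters,
// except 0xD7 (multiplication sign) and 0xF7 (division sign).
inline bool is_word_char(unsigned char ch) {
    if (ch < 0x80) return std::isalnum(ch) != 0;  // plain ASCII letters and digits
    return ch >= 0xC0 && ch != 0xD7 && ch != 0xF7;
}

// Hypothetical tokenizer: split text into terms at any non-word character.
inline std::vector<std::string> split_terms(const std::string& text) {
    std::vector<std::string> terms;
    std::string current;
    for (unsigned char ch : text) {
        if (is_word_char(ch)) {
            current += static_cast<char>(ch);
        } else if (!current.empty()) {
            terms.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) terms.push_back(current);
    return terms;
}
```

With a test like that, a byte such as 0xF8 (ø) stays inside the term rather
than ending it. UTF-8 input would need real multi-byte decoding instead of
this per-byte check.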

> I'm using some extensions/patches from Olly Betts that enable
> Unicode - either you have to wait until Olly Betts is back or
> you have to nag him personally ;-}
> I'm not sure about the status of his patches and I'd hate to
> release code that's considered non-public.

It's public (I've already posted the patches to the mailing list!)

> Anyway, I had to tweak the patches to apply them to 0.9.2 (and had to
> change some signatures to get them to compile ...).

I'll hopefully get this cleaned up and merged in soon.


