[Xapian-discuss] TermGenerator question for the single quote character

tata 668 tata668 at gmail.com
Mon Apr 6 15:49:00 BST 2009


Olly,

Are you saying that it might work the way I want with the French 
stemmer? I currently use the default Xapian installation, so I guess 
that means the English stemmer.

If using the French stemmer could help, can you point me to a link that 
would help me change the stemmer? I don't see anything about this in 
the quickstart or on the stemmer information page ( 
http://xapian.org/docs/stemming.html ).

Thanks a lot for the help,

Julien



Olly Betts wrote:
> On Sun, Apr 05, 2009 at 07:18:08PM -0400, tata 668 wrote:
>   
>> I use the TermGenerator to index the French text "Cela m'excite" 
>> (without the quotes). When I search for "excite" after indexing 
>> it, I need it to be found. "excite" is a word on its own.
>>
>> Currently "excite" is not found but "m'excite" is...
>>     
>
> In 1.0.0, we changed to treating apostrophes as part of a word, and
> updated to a newer version of Snowball where the English stemmer
> deals with them.
>
> I think the correct way for this to work is for the other stemmers
> to also handle apostrophes (at least if their languages use them)
> as otherwise the word tokenisation required depends on the stemmer.
>
>   
>> Is there a setting I'm missing so that the single quote character acts as 
>> a word delimiter?
>>     
>
> No, there's no such setting currently.
>
> Cheers,
>     Olly
>
>   
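
Since there is no such setting, one possible workaround is to rewrite the 
text before it reaches the TermGenerator. The sketch below is only an 
illustration of that idea, not something proposed in this thread: the helper 
name split_on_apostrophes is hypothetical, and it assumes that replacing 
every apostrophe with a space is acceptable for your data.

```python
import re

# Matches the straight apostrophe and the typographic one (U+2019).
_APOSTROPHES = re.compile("['\u2019]")


def split_on_apostrophes(text: str) -> str:
    """Replace apostrophes with spaces so each side becomes a separate word.

    Hypothetical pre-processing helper: call it on the text before handing
    the text to the indexer, and on the query string before parsing it.
    """
    return _APOSTROPHES.sub(" ", text)


if __name__ == "__main__":
    # The cleaned string is what would then be passed to the indexer.
    print(split_on_apostrophes("Cela m'excite"))
```

The trade-off is that the elided part ("m" here) gets indexed as a term of 
its own, and English forms such as "John's" would be split as well, so this 
is probably only suitable when the collection is mostly French.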

