[Xapian-discuss] Proper noun stemming

Colin Bell colinabell at gmail.com
Thu Mar 27 12:47:33 GMT 2008


Thanks for that, James.

On 27 Mar 2008, at 12:26, James Aylett wrote:
>
> As one of the above documents says, the convention is to store
> unstemmed forms with positional information, so the proximity of
> 'Gordon' to 'Brown' is retained in the database, and PHRASE and NEAR
> searches will be able to take advantage of that. (So the search
> 'meeting "Gordon Brown"' should match the above well.)

This sounds ideal. Storing "Gordon", "Brown" and "Gordon Brown" and
linking them is a great solution. The only trick is picking out proper
nouns like "Gordon Brown" or "Prime Minister" during the stemming
process so that they can be stored as phrases. Will TermGenerator be
able to do this? I'm going through the docs on this right now.
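
To make the idea concrete, here is a rough sketch in plain Python (it does
not use the Xapian API, and it is not how Xapian stores anything
internally). It assumes a toy positional index that maps each unstemmed,
lowercased word to its positions, checks a phrase by looking for
consecutive positions, and guesses at proper nouns with a hypothetical
heuristic: runs of two or more capitalised words. All function names are
made up for illustration.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[A-Za-z]+")


def build_positional_index(text):
    """Map each lowercased word to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos, match in enumerate(WORD_RE.finditer(text)):
        index[match.group().lower()].append(pos)
    return index


def phrase_positions(index, phrase):
    """Return the start positions where the words of `phrase` are adjacent."""
    words = [w.lower() for w in WORD_RE.findall(phrase)]
    if not words:
        return []
    starts = []
    for start in index.get(words[0], []):
        # Word i of the phrase must sit exactly i positions after the start.
        if all(start + i in index.get(w, []) for i, w in enumerate(words)):
            starts.append(start)
    return starts


def candidate_proper_nouns(text):
    """Hypothetical heuristic: runs of two or more capitalised words."""
    return re.findall(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b", text)


text = "At the meeting Gordon Brown spoke, and the Prime Minister left early."
index = build_positional_index(text)
matches = phrase_positions(index, "Gordon Brown")
candidates = candidate_proper_nouns(text)
```

The heuristic is deliberately crude: a capitalised word at the start of a
sentence gets swept into the run that follows it, and single-word names
are missed. With positions stored, though, a phrase query can recover
"Gordon Brown" at search time without the indexer having to recognise it
as a proper noun first.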

Thanks very much for your help. There is so much to try to get my head
around!

Regards

Colin




