[Xapian-discuss] Proper noun stemming
Colin Bell
colinabell at gmail.com
Thu Mar 27 12:47:33 GMT 2008
Thanks for that, James.
On 27 Mar 2008, at 12:26, James Aylett wrote:
>
> As one of the above documents says, the convention is to store
> unstemmed forms with positional information, so the proximity of
> 'Gordon' to 'Brown' is retained in the database, and PHRASE and NEAR
> searches will be able to take advantage of that. (So the search
> 'meeting "Gordon Brown"' should match the above well.)
This sounds ideal. Storing "Gordon", "Brown" and "Gordon Brown" and
linking them is a great solution. The only trick is picking out proper
nouns like "Gordon Brown" or "Prime Minister" during the stemming
process to store them as phrases. Will TermGenerator be able to do
this? I'm going through the docs on this right now.
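
To make the idea concrete, here is a minimal sketch in plain Python. It does not use Xapian: the function names (candidate_phrases, build_positional_index, phrase_matches) are hypothetical, and the "two or more capitalised words in a row" rule is only an assumed heuristic for spotting proper nouns. It shows the two halves of the problem: picking out candidate phrases before indexing, and using stored word positions to confirm that a phrase matches.

```python
import re
from collections import defaultdict

# Assumed heuristic: two or more consecutive capitalised words,
# e.g. "Gordon Brown". This is not a Xapian feature.
CAPITALISED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def candidate_phrases(text):
    """Return runs of capitalised words as possible proper-noun phrases."""
    return CAPITALISED_RUN.findall(text)


def build_positional_index(text):
    """Map each lowercased word to the 1-based positions where it occurs."""
    index = defaultdict(list)
    for pos, word in enumerate(re.findall(r"[A-Za-z]+", text), start=1):
        index[word.lower()].append(pos)
    return index


def phrase_matches(index, phrase):
    """True if the words of `phrase` occur at consecutive positions."""
    words = [w.lower() for w in phrase.split()]
    if not words:
        return False
    return any(
        all(start + i in index.get(w, ()) for i, w in enumerate(words))
        for start in index.get(words[0], ())
    )


text = "The meeting with Gordon Brown, the Prime Minister, ran late."
phrases = candidate_phrases(text)
index = build_positional_index(text)

for phrase in phrases:
    print(phrase, phrase_matches(index, phrase))
```

The heuristic is crude: a capitalised word at the start of a sentence followed by another capitalised word would be picked up too, so real text would need a stop list or a proper named-entity step. The positional half mirrors the convention James describes, where unstemmed forms are stored with positions so that phrase searches can check adjacency.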
Thanks very much for your help. There is so much to try to get my head
around!
Regards
Colin