[Xapian-discuss] QueryParser lowercase / uppercase and stemming

Olly Betts olly at survex.com
Wed May 17 23:31:21 BST 2006


On Wed, May 17, 2006 at 04:17:39PM +0200, dd wrote:
> The second thing I wondered about, is there any possibility to forbid 
> queryparser lowercasing of the query string. At least for exact phrase 
> matching I found this quite meaningful. (Data is indexed both, upper- 
> and lowercase)

I just realised I missed this.

I'm not convinced it's actually a sensible way to index - the only
example I know of where it's useful is NeXT computers, which got
merged into Apple about a decade ago.  Especially in these days of
ubiquitous web search, nobody sane would pick a common word and
just vary the capitalisation to name their product or company.  And
enough people will ignore the official spelling and write "NEXT
Computers" or "Next computers" that being pedantic about capitalisation
also has a negative effect on retrieval performance.
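
To illustrate the retrieval point, here is a toy sketch in Python.  It is
not Xapian code: build_index() and search() are made-up helpers, the three
"documents" are invented, and splitting on whitespace stands in for real
term generation.  It only shows how a case-sensitive index misses the
variant spellings that a case-folded index would find.

```python
from collections import defaultdict


def build_index(docs, fold_case):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            # Hypothetical normalisation step: lowercase only when asked to.
            term = word.lower() if fold_case else word
            index[term].add(doc_id)
    return index


def search(index, query, fold_case):
    """Return the ids of documents containing every term of the query."""
    terms = [w.lower() if fold_case else w for w in query.split()]
    if not terms:
        return set()
    # .get() avoids creating empty entries in the defaultdict.
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings)


# Invented documents using the spellings people tend to write.
docs = {
    1: "NeXT computers",
    2: "NEXT Computers",
    3: "Next computers",
}

exact = build_index(docs, fold_case=False)
folded = build_index(docs, fold_case=True)

exact_hits = search(exact, "NeXT computers", fold_case=False)
folded_hits = search(folded, "NeXT computers", fold_case=True)

print("case-sensitive:", sorted(exact_hits))
print("case-folded:   ", sorted(folded_hits))
```

With the case-sensitive index, only the document that uses the official
spelling matches the query; with the folded index, all three do.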
 
But it shouldn't be too hard to add an option for it.  I'll take a
look when I'm next fiddling with the QueryParser.

Cheers,
    Olly
