[Xapian-discuss] trailing signed char in QueryParser

Ralf Mattes rm at tuxteam.de
Mon Mar 6 18:08:58 GMT 2006


On Tue, Mar 07, 2006 at 02:58:47AM +0900, Sungsoo Kim wrote:
> Dear Olly,
> 
> I have run into some unexpected behaviour.
> Please look at the following test commands and their results:
> 
>     $ python search.py -v C++
>     Performing query 'Xapian::Query(c++:(pos=1))'
>     0 results found
> 
>     $ python search.py -v c++
>     Performing query 'Xapian::Query(c:(pos=1))'
>     10 results found
> 
> From the following code, I understand that QueryParser looks
> up terms in the database:
> 
>     // If the suffixed term doesn't exist, check that the
>     // non-suffixed term does.  This also takes care of
>     // the case when QueryParser::set_database() hasn't
>     // been called.
>     if (db.term_exists(suff_term) || !db.term_exists(term)) {
>         term = suff_term;
>         it = p;
>     }
> 
> In my database the term 'c' exists, but 'C' doesn't.
> All the terms in my database are indexed in lowercase,
> because I understood that QueryParser always converts
> terms to lowercase.
> 
> Why doesn't QP convert the term to lowercase before it
> calls db.term_exists() to look up the term?
> 
> Is there any reason I am not aware of?
> 
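
To trace the quoted check against the two commands above, here is a
minimal Python sketch. It is not Xapian code: pick_term() and db_terms
are hypothetical names, a plain set stands in for db.term_exists(), and
the final lowercasing step is an assumption inferred from the query
output shown.

```python
def pick_term(term, suffix, existing_terms):
    """Mimic the quoted check: keep the suffix unless only the bare term exists."""
    suff_term = term + suffix
    # Same condition as the quoted C++ snippet; the set replaces db.term_exists().
    if suff_term in existing_terms or term not in existing_terms:
        term = suff_term
    # Assumption: lowercasing happens only after the case-sensitive lookup.
    return term.lower()


# Hypothetical database in which only the lowercase term 'c' is indexed.
db_terms = {"c"}

print(pick_term("C", "++", db_terms))  # c++  ('C' is not found, so the suffix is kept)
print(pick_term("c", "++", db_terms))  # c    ('c' is found, so the suffix is dropped)
```

Under these assumptions the lookup is done with the term's original
case, which reproduces the difference between the two queries.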

QP treats capitalized terms as "raw" terms, i.e. terms
that should not be stemmed: "Test" is parsed to "Rtest",
and "C" is parsed to "Rc".
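
As a rough illustration of that rule only (not the actual QueryParser
implementation), the mapping could be sketched in Python as follows.
parse_term() and toy_stem() are hypothetical names, and the toy stemmer
is merely a stand-in for a real one.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strip one trailing 's'."""
    return word[:-1] if word.endswith("s") else word


def parse_term(word, stem=toy_stem):
    """Capitalized words become unstemmed 'R' terms; others are stemmed."""
    if word[:1].isupper():
        # Raw term: lowercase it, add the 'R' prefix, skip stemming.
        return "R" + word.lower()
    return stem(word.lower())


print(parse_term("Test"))   # Rtest
print(parse_term("C"))      # Rc
print(parse_term("tests"))  # test
```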

 HTH Ralf Mattes

> For better Xapian,
> 
> 
> Sungsoo Kim
