[Xapian-discuss] Narrowed the problem down a bit

Olly Betts olly@survex.com
Tue, 27 Apr 2004 01:52:56 +0100

On Mon, Apr 26, 2004 at 02:45:36PM +0100, Donald Fisk wrote:
> The one thing I can think of is that the percentage weight of the match
> in the offending query (for all matching documents) is 1.

That looks incorrect, but it's actually unrelated to your problem.

Your problem lies here:

>     for i,word in enumerate(text.split()):
>         document.add_posting(stemmer.stem_word(word),i)

The stemmers don't lowercase, so in this:

>     xapianAdd("  Jet skiing (3000 m away) ",
>               xapianDB)

"Jet" is indexed as "Jet".

However Xapian::QueryParser expects index terms to have been lower-cased
and stemmed, so the phrase "jet skiing" doesn't find any matches.
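
A minimal sketch of the fix is to lowercase each word before stemming it.
The index_terms helper and its stem parameter below are hypothetical names
used only for illustration; stem stands in for stemmer.stem_word from your
script and defaults to the identity function so the snippet runs without
Xapian installed:

```python
def index_terms(text, stem=lambda word: word):
    """Return (term, position) pairs, lowercasing each word before stemming.

    `stem` is a stand-in for stemmer.stem_word (an assumption for this
    sketch); the default leaves the lowercased word unchanged.
    """
    return [(stem(word.lower()), pos)
            for pos, word in enumerate(text.split())]


terms = index_terms("  Jet skiing (3000 m away) ")
# The first term is now "jet" rather than "Jet", which is the form the
# query parser looks up for the phrase "jet skiing".
print(terms[0])
```

In your loop, that would amount to passing stemmer.stem_word(word.lower())
to document.add_posting instead of stemmer.stem_word(word).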

Try changing this line:

>     for q in ['ski NOT "jet skiing"','ski']:

to this:

>     for q in ['ski NOT "jet skiing"','"jet skiing"','ski']: