[Xapian-discuss] Order of NOT operand?

Olly Betts olly at survex.com
Tue Sep 1 06:04:07 BST 2009


On Mon, Aug 31, 2009 at 10:01:58AM -0400, David Sauve wrote:
> I'm having a strange issue with NOT queries in my xapian backend for
> Django-Haystack.  The query string is generated through user input, and as
> such, the order is undetermined.

Hmm, "generated through"?  The query string should really *be* user
input.  It is almost inevitably a mistake to try to modify it before
passing it to Xapian.  If you want to apply other filtering, combine
queries, etc., then do that to the Xapian::Query object(s) produced.

> I wouldn't think that would matter, but
> the following two queries are generating different search results:
> 
> java AND NOT id:1 NOT id:2
> vs.
> NOT id:1 NOT id:2 AND java

What sort of prefix is "id"?

> Logically, I'd think these would be the same, but in practice, they're not.
> The first format seems to generate random results, but the second generates
> the correct results.

They aren't quite the same in practice - the first is:

((java NOT id:1) NOT id:2)

And the second (with FLAG_PURE_NOT enabled) is:

((<everything> NOT id:1) NOT id:2) AND java

Ideally the <everything> in the second case would be eliminated by the
optimiser, but I don't think it currently is.

But these should both match the same documents.
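To see why the two parse trees should select the same documents, here is a
minimal sketch that models each query as a set of matching document ids.  It
does not use Xapian at all: the four-document corpus, the term labels such as
"id:1", and the helper names are all made up for illustration, and AND_NOT is
modelled as plain set difference.

```python
# Toy corpus (hypothetical): document id -> set of terms indexing it.
docs = {
    1: {"java", "id:1"},
    2: {"java", "id:2"},
    3: {"java", "id:3"},
    4: {"python", "id:4"},
}


def term(t):
    """Return the set of document ids that contain term t."""
    return {doc_id for doc_id, terms in docs.items() if t in terms}


def and_not(left, right):
    """Model AND_NOT: documents matching left but not right."""
    return left - right


def and_(left, right):
    """Model AND: documents matching both sides."""
    return left & right


# Stand-in for <everything>: every document in the corpus.
everything = set(docs)

# ((java NOT id:1) NOT id:2)
first = and_not(and_not(term("java"), term("id:1")), term("id:2"))

# ((<everything> NOT id:1) NOT id:2) AND java
second = and_(
    and_not(and_not(everything, term("id:1")), term("id:2")),
    term("java"),
)

print(sorted(first), sorted(second))
```

Both expressions reduce to "contains java, and is neither id:1 nor id:2", so
in this model the two sets come out identical.  This only covers which
documents match, not how they are weighted or ordered.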

I'd check the parsed Query objects with get_description() to see if they
look right.

Cheers,
    Olly


