[Xapian-discuss] add_prefix() versus add_boolean_prefix()

Olly Betts olly at survex.com
Mon Nov 3 00:03:30 GMT 2008


On Tue, Oct 28, 2008 at 07:27:25PM +0100, Daniel Ménard wrote:
> But if the same request is written like this: [test author:(john doe)],
> I get the following query:
> Xapian::Query(((test:(pos=1) OR doe:(pos=2)) FILTER A(john))
> which looks strange to me ("doe" is not a filter anymore, extra
> parenthesis before "john").

It's a bad example to use "author:" here, since that would naturally
be a free-text search, and it means that examples which look reasonable
don't necessarily make much sense in the actual boolean prefix case.

Anyway, this behaviour is currently as expected - you can't apply a
boolean prefix to a subexpression, so the parser treats the "(" as part
of the term.  In your example the subexpression isn't really boolean, so
here's a better one, where "type:" is a boolean prefix:

type:(html pdf)

I'm not really sure that makes a lot of sense (all I can think of is to
treat it the same as: type:html type:pdf, i.e. to OR filters with the
same prefix, which isn't totally obvious behaviour either).
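
To make the "OR filters with the same prefix" idea concrete, here is a
toy sketch in plain Python.  It is not the QueryParser code and doesn't
use Xapian at all; the function name combine_filters and the XLANG
prefix are made up for illustration, and ANDing the groups for
different prefixes is my assumption about how the groups are combined.

```python
from collections import defaultdict


def combine_filters(filters):
    """Describe how (prefix, value) boolean filters would be combined.

    Toy model only: filters sharing a term prefix are ORed together,
    and the per-prefix groups are then ANDed (the AND is an assumption).
    """
    groups = defaultdict(list)
    for prefix, value in filters:
        groups[prefix].append(prefix + value)

    parts = []
    for prefix in sorted(groups):
        terms = groups[prefix]
        if len(terms) == 1:
            parts.append(terms[0])
        else:
            parts.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(parts)


# type:html type:pdf
print(combine_filters([("XTYPE", "html"), ("XTYPE", "pdf")]))
# prints: (XTYPEhtml OR XTYPEpdf)
```

Under that model, type:(html pdf) would just be another way of writing
the two-filter query shown in the final comment.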

I can see that there's a natural meaning for this case, which I don't
think we currently handle:

type:(html OR pdf)

> A similar problem appears if I try a phrase search: [test author:"john
> doe"] gives
> Xapian::Query(((test:(pos=1) OR doe:(pos=2)) FILTER A"john))

I'm not really sure what you expect this to mean - a phrase isn't a
boolean sub-expression, and I wouldn't expect boolean filter terms to
have positional information.
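
For what it's worth, both of the queries you quote fall out of a very
simple model: split on whitespace, and if a token starts with a boolean
prefix, take the rest of that token verbatim as the term.  The sketch
below is a toy in plain Python, not the real parser; toy_parse and the
punctuation stripping for free-text words are assumptions made for the
illustration, and the "author" -> "A" mapping is taken from your example.

```python
import re

# Field name -> term prefix, as in the quoted example.
BOOLEAN_PREFIXES = {"author": "A"}


def toy_parse(query):
    """Return (free_text_terms, filter_terms) for a query string."""
    terms, filters = [], []
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep and field in BOOLEAN_PREFIXES:
            # No subexpression or phrase handling here: a leading "("
            # or '"' simply becomes part of the filter term.
            filters.append(BOOLEAN_PREFIXES[field] + value)
        else:
            # Free-text word: drop punctuation and lowercase it.
            word = re.sub(r"\W+", "", token).lower()
            if word:
                terms.append(word)
    return terms, filters


print(toy_parse("test author:(john doe)"))
# prints: (['test', 'doe'], ['A(john'])
print(toy_parse('test author:"john doe"'))
# prints: (['test', 'doe'], ['A"john'])
```

In both cases "doe" ends up as an ordinary free-text term, and the
filter term keeps the stray "(" or '"', just as in the queries above.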

Looking at a better example, what would you expect this to mean?

type:"html pdf"

Incidentally, http://trac.xapian.org/ticket/128 suggests it should be a
single filter term containing a space, which seems a reasonable way to
allow that to be specified.  So in this case, the term would be:

XTYPEhtml pdf
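
A rough sketch of that interpretation, again as a plain Python toy and
not anything the QueryParser currently does: the regular expression and
the filter_terms name are hypothetical, and "type" -> "XTYPE" is just
the mapping assumed in the example above.

```python
import re

# Field name -> term prefix (assumed mapping for this example).
BOOLEAN_PREFIXES = {"type": "XTYPE"}

# Matches field:"quoted value" or field:bare_value.
FILTER_RE = re.compile(r'(\w+):(?:"([^"]*)"|(\S+))')


def filter_terms(query):
    """Return the boolean filter terms found in a query string.

    A quoted value is kept whole, spaces included, as one term.
    """
    terms = []
    for field, quoted, bare in FILTER_RE.findall(query):
        if field in BOOLEAN_PREFIXES:
            value = quoted if quoted else bare
            terms.append(BOOLEAN_PREFIXES[field] + value)
    return terms


print(filter_terms('type:"html pdf"'))
# prints: ['XTYPEhtml pdf']
```

The quotes here only delimit the value; no positional information is
involved, so this stays a single boolean term and not a phrase.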

Cheers,
    Olly


