[Xapian-discuss] RE : QueryParser: aliases and OP_AND

Menard, Daniel Daniel.Menard at ehesp.fr
Wed Jan 20 18:54:15 GMT 2010


Hi Richard!

Thanks a lot for your detailed explanation.

> The short answer: it always combines each word of the options with
> OP_OR, and there is no configuration for this currently.

I've thought about it some more and, in fact, I think that what I was asking for is stupid!

If Xapian were to interpret "alias:john AND alias:smith" the way I asked:
(AUT1:john AND AUT1:smith) OR (AUT2:john AND AUT2:smith)
then the same would apply to the "default index", the one you create by calling qp.add_prefix("",...).

In this case, an unfielded query like "videos fun 2010" (without quotes) on a database with separate "fields" for document type, tags and dates would be translated as:
(type=videos AND type=fun AND type=2010) OR (date=videos AND date=fun AND date=2010) OR (tag= etc.)
and would certainly give no results.

It would also be a problem for some of my aliases, which aggregate fields of different kinds.
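
To make the difference concrete, here is a minimal sketch in plain Python. It does not use Xapian at all; it only models the two expansions discussed above as nested lists of (prefix, word) pairs. The function names and the sample document are made up for illustration, and the prefixes "type", "date" and "tag" are the placeholder names from the example above.

```python
def per_word_or(words, prefixes):
    # Behaviour described in this thread: each word is OR-ed across
    # all prefixes, then the words are joined with AND.
    return [[(p, w) for p in prefixes] for w in words]


def per_prefix_and(words, prefixes):
    # What I originally asked for: all words AND-ed within one prefix,
    # then the per-prefix groups joined with OR.
    return [[(p, w) for w in words] for p in prefixes]


def matches_and_of_ors(groups, doc_terms):
    # Every group needs at least one term present in the document.
    return all(any(t in doc_terms for t in g) for g in groups)


def matches_or_of_ands(groups, doc_terms):
    # At least one group needs all of its terms present in the document.
    return any(all(t in doc_terms for t in g) for g in groups)


def render(groups, inner, outer):
    # Human-readable form, e.g. "(type=videos AND type=fun) OR (...)".
    return f" {outer} ".join(
        "(" + f" {inner} ".join(f"{p}={w}" for p, w in g) + ")" for g in groups
    )


words = ["videos", "fun", "2010"]
prefixes = ["type", "date", "tag"]
# Hypothetical document: a fun video from 2010.
doc = {("type", "videos"), ("tag", "fun"), ("date", "2010")}

current = per_word_or(words, prefixes)
requested = per_prefix_and(words, prefixes)

print(render(current, "OR", "AND"))
print(render(requested, "AND", "OR"))
print(matches_and_of_ors(current, doc))    # True
print(matches_or_of_ands(requested, doc))  # False
```

The sample document is found by the per-word expansion but not by the per-prefix one, because no single field contains all three words.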

BTW, the API documentation for qp.add_prefix() clearly mentions that terms are combined with OP_OR in this case:
http://xapian.org/docs/apidoc/html/classXapian_1_1QueryParser.html#dfd545b4ac739adc2e4171169a500f33

So once again, thank you for your time (and for switching on my brain!)

Daniel

> -----Original Message-----
> From: Richard Boulton [mailto:boulton.rj at googlemail.com]
> Sent: Wednesday, 20 January 2010 16:53
> To: Menard, Daniel
> Cc: Xapian Discussion
> Subject: Re: [Xapian-discuss] QueryParser: aliases and OP_AND
> 
> 
> 2010/1/19 Menard, Daniel <Daniel.Menard at ehesp.fr>:
> > Hello,
> >
> > I'm wondering about how the QueryParser parses a query containing an
> > "alias" when the default operator is OP_AND (by "alias", I mean a
> > search field mapped to multiple term prefixes).
> 
> The short answer: it always combines each word of the options with
> OP_OR, and there is no configuration for this currently.
> 
> I can see why it's not helpful in your situation, but I can't
> think of an easy solution inside the query parser.  What you
> want would seem to be to take each group of words in the query (ie,
> each group of words not separated by an operator) and parse them with
> each prefix, and then combine those options with OP_OR.  We were
> having some discussions on this subject on IRC recently - it would be
> nice to have a way to specify how the queryparser deals with each set
> of words it finds.
> 
> One workaround for you might be to parse the query multiple times,
> each time using one of the prefixes for the field, and OR the
> resulting queries together.  This would produce the result you wanted
> in the case of a search for "alias:(john smith)".  This approach would
> even work in the presence of other filters in the query.  It would
> fall down (or, at least, become overly complex) if you wanted multiple
> expansions for more than one field, though.
> 
> > So my questions: is the current QueryParser's behaviour the intended
> > one? Is there anything I can do to get the result I expect?
> 
> > Of course, the parsing is fine when default_op is OP_OR (as it is by
> > default), but results are also "strange for me" if I try with OP_PHRASE:
> >
> > Xapian::Query(((AUT1:john:(pos=1) PHRASE 2 AUT1:smith:(pos=2)) OR
> > (AUT2:john:(pos=1) PHRASE 2 AUT1:smith:(pos=2)) OR
> > (AUT2:john:(pos=1) PHRASE 2 AUT2:smith:(pos=2)) OR
> > (AUT1:john:(pos=1) PHRASE 2 AUT2:smith:(pos=2))))
> 
> This is the same problem, but looks a bit weirder because the query
> has been rearranged by Xapian to put the PHRASE parts at the bottom of
> the tree.  (A OR B) PHRASE (C OR D) is equivalent to (A PHRASE C) OR
> (A PHRASE D) OR (B PHRASE C) OR (B PHRASE D).
> 
> -- 
> Richard
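
The equivalence Richard gives for PHRASE over OR can be sketched in plain Python as a Cartesian product over the alternatives at each phrase position. This is only an illustration of the rewrite rule, not Xapian's actual implementation; the function names are made up, and the order of the resulting sub-phrases is not claimed to match the order Xapian prints.

```python
from itertools import product


def distribute_phrase(positions):
    # positions: one list of alternative terms per phrase position,
    # e.g. [[A, B], [C, D]] for (A OR B) PHRASE (C OR D).
    # Returns every way of picking one alternative per position.
    return [list(combo) for combo in product(*positions)]


def render(phrases, window):
    # Human-readable form: "(A PHRASE 2 C) OR (A PHRASE 2 D) OR ..."
    return " OR ".join(
        "(" + f" PHRASE {window} ".join(terms) + ")" for terms in phrases
    )


positions = [
    ["AUT1:john", "AUT2:john"],    # alternatives for the first word
    ["AUT1:smith", "AUT2:smith"],  # alternatives for the second word
]

expanded = distribute_phrase(positions)
print(render(expanded, 2))
```

With two prefixes and two words this yields four sub-phrases, including the mixed ones (AUT1:john followed by AUT2:smith, and the reverse) that look odd in the query dump above.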


