[Xapian-discuss] normal prefix vs. boolean prefix

Jim Lynch jim at fayettedigital.com
Sat Nov 8 12:33:48 GMT 2008


Torsten Foertsch wrote:

>Hi,
>
>just to clarify, what is the difference between a normal prefix and a 
>boolean prefix?
>
>If I understand it correctly a normal prefix is a way to give a name to 
>a certain part of the index. When the ranking is computed the values 
>influence the ranking of a resulting entity according to their weight.
>
>A boolean prefix also names a certain part of the index but its purpose 
>is to filter the result set literally without influence on the ranking.
>
>Assuming:
>
>  $qp->add_prefix(normal=>'S');
>  $qp->add_boolean_prefix(bool=>'S');
>
>the 2 queries "normal:something AND something else" and "bool:something 
>AND something else" will give the same result set but possibly with 
>different ranking. But even the ranking would be the same if the weight 
>of the "normal:something" part of the first query is set to 0.
>
>Is that true?
>
>Another question, is it possible to do something like the following in 
>Perl?
>
>Xapian::Query q(OP_AND, Xapian::Query(OP_SCALE_WEIGHT, q_title, FACTOR), 
>q_body);
>
>Just a guess:
>
>Search::Xapian::Query->new(OP_AND,
>                           Search::Xapian::Query->new(OP_SCALE_WEIGHT, 
>                                                      $q_title, 2.5),
>                           $q_body)
>
>Thanks,
>Torsten
>
It may be easier to define a boolean field by saying what it isn't.  You 
don't do a free-text search on boolean fields.  Boolean fields are used 
primarily to store terms that are not searched the way text fields are.  
An example of a boolean field might be a field describing the model year 
of a car, or the gender of the author of the document.  These fields are 
not "searched" like the text of a document; they are used to limit the 
set of documents that will then be searched.  In English, the search 
might be: give me all documents that were written by a male (gender).  
Or: give me all the documents that describe cars built in 1957 (model 
year) AND contain the term "soft".

For efficiency, the search engine first limits the possible set of 
documents by applying the boolean filters, followed by the search. 

I don't know if the boolean terms affect the ranking but I suspect not. 
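
To make the filter-then-search idea concrete, here is a rough toy model 
in C++.  It is only a sketch and is not Xapian's code or API: the Doc and 
Hit types, the score() and search() functions, the sample data and the 
sum-of-frequencies scoring are all made up for illustration (Xapian's 
real default weighting scheme is BM25).  In this model the boolean terms 
only decide which documents are candidates; the score comes from the 
text terms alone.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy document (hypothetical, not a Xapian type): free-text terms with
// their within-document frequency, plus exact-match boolean terms.
struct Doc {
    int id;
    std::map<std::string, int> text_terms;  // term -> frequency
    std::set<std::string> bool_terms;       // e.g. "year:1957"
};

struct Hit {
    int id;
    double score;
};

// Made-up scoring: sum of the frequencies of the query terms.
// Boolean terms are never consulted here, so they cannot change a score.
double score(const Doc& d, const std::vector<std::string>& query_terms) {
    double s = 0.0;
    for (const auto& t : query_terms) {
        auto it = d.text_terms.find(t);
        if (it != d.text_terms.end()) s += it->second;
    }
    return s;
}

// Keep only documents that carry every filter term, then rank the
// survivors that match at least one text term, highest score first.
std::vector<Hit> search(const std::vector<Doc>& docs,
                        const std::vector<std::string>& query_terms,
                        const std::vector<std::string>& filters) {
    std::vector<Hit> hits;
    for (const auto& d : docs) {
        bool pass = std::all_of(
            filters.begin(), filters.end(),
            [&](const std::string& f) { return d.bool_terms.count(f) > 0; });
        if (!pass) continue;
        double s = score(d, query_terms);
        if (s > 0.0) hits.push_back({d.id, s});
    }
    std::stable_sort(hits.begin(), hits.end(),
                     [](const Hit& a, const Hit& b) { return a.score > b.score; });
    return hits;
}

// Invented sample data: three cars from 1957 and one from 1962.
std::vector<Doc> sample_docs() {
    return {
        {1, {{"soft", 1}, {"top", 1}}, {"year:1957"}},
        {2, {{"soft", 3}}, {"year:1957"}},
        {3, {{"soft", 5}}, {"year:1962"}},
        {4, {{"chrome", 2}}, {"year:1957"}},
    };
}
```

Calling search(sample_docs(), {"soft"}, {"year:1957"}) drops document 3 
(wrong year) and document 4 (no text match).  Documents 2 and 1 keep 
exactly the scores they have when the same query is run with no filter.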

I hope I haven't confused the issue.
Jim.



