[Xapian-tickets] [Xapian] #394: Speed up phrase queries with a "settling pond"

Xapian nobody at xapian.org
Sun Aug 9 16:02:55 BST 2009


#394: Speed up phrase queries with a "settling pond"
-------------------------+--------------------------------------------------
 Reporter:  olly         |       Owner:  olly 
     Type:  enhancement  |      Status:  new  
 Priority:  normal       |   Milestone:       
Component:  Matcher      |     Version:  1.1.2
 Severity:  normal       |   Blockedby:       
 Platform:  All          |    Blocking:       
-------------------------+--------------------------------------------------
 The attached patch implements a "settling pond" to delay the checking of
 exact phrases which are "and-like with the root" (by which I mean that if
 the phrase doesn't match, the whole query doesn't match).  Pond entries
 whose weight is below the current min_weight are discarded without being
 checked.  When the pond fills up, or the postlist tree is exhausted, we
 take the highest weighted entries from the pond first, which makes it more
 likely we'll increase min_weight and so be able to discard lower weighted
 pond entries without needing to do the potentially expensive phrase check.
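
 The idea can be illustrated with a minimal standalone sketch.  This is not
 the attached patch and uses no Xapian internals: Candidate, SettlingPond,
 pond_size, max_results and the phrase_check callback are all hypothetical
 names, and draining the whole pond in one go is an assumption made for
 brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical candidate: a document that matched everything except the
// phrase check, which has not been performed yet.
struct Candidate {
    unsigned docid;
    double weight;
};

// Heap orderings: by_weight keeps the heaviest entry at the front,
// by_weight_rev keeps the lightest entry at the front.
static bool by_weight(const Candidate& a, const Candidate& b) {
    return a.weight < b.weight;
}
static bool by_weight_rev(const Candidate& a, const Candidate& b) {
    return a.weight > b.weight;
}

class SettlingPond {
    std::vector<Candidate> pond;  // max-heap of unchecked candidates
    std::vector<Candidate> best;  // min-heap of accepted results
    std::size_t pond_size;
    std::size_t max_results;
    std::function<bool(unsigned)> phrase_check;  // the expensive test
    double min_weight = 0.0;

  public:
    std::size_t checks = 0;  // number of phrase checks actually performed

    SettlingPond(std::size_t pond_size_, std::size_t max_results_,
                 std::function<bool(unsigned)> phrase_check_)
        : pond_size(pond_size_), max_results(max_results_),
          phrase_check(std::move(phrase_check_)) {
        assert(pond_size > 0 && max_results > 0);
    }

    // Defer the phrase check: just drop the candidate into the pond.
    void add(const Candidate& c) {
        if (c.weight < min_weight) return;  // can't make the results
        pond.push_back(c);
        std::push_heap(pond.begin(), pond.end(), by_weight);
        if (pond.size() >= pond_size) drain();
    }

    // Check entries heaviest first, so min_weight rises as early as possible.
    void drain() {
        while (!pond.empty()) {
            std::pop_heap(pond.begin(), pond.end(), by_weight);
            Candidate c = pond.back();
            pond.pop_back();
            if (c.weight < min_weight) {
                // Everything left in the pond is lighter still, so it can
                // all be discarded without a phrase check.
                pond.clear();
                break;
            }
            ++checks;
            if (phrase_check(c.docid)) accept(c);
        }
    }

    // Call once the postlist tree is done; returns heaviest first.
    std::vector<Candidate> results() {
        drain();
        std::vector<Candidate> out = best;
        std::sort(out.begin(), out.end(), by_weight_rev);
        return out;
    }

  private:
    void accept(const Candidate& c) {
        best.push_back(c);
        std::push_heap(best.begin(), best.end(), by_weight_rev);
        if (best.size() > max_results) {
            std::pop_heap(best.begin(), best.end(), by_weight_rev);
            best.pop_back();
        }
        if (best.size() == max_results) min_weight = best.front().weight;
    }
};
```

 For example, with a pond of 4 entries, 2 wanted results, a phrase check
 that always succeeds, and eight candidates added in increasing weight
 order, this sketch performs 4 phrase checks rather than 8.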

 This patch can dramatically improve query speeds for exact phrase queries
 on common terms when the positional data isn't cached.

 This could be extended to any OP_PHRASE or OP_NEAR check which is
 "and-like with the root", and also to handle more than one such check per
 query.
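
 As a rough illustration of what "and-like with the root" means, the sketch
 below walks a toy query tree and collects the PHRASE and NEAR nodes whose
 failure would make the whole query fail.  The Op enum, Node struct and
 collect_and_like() are hypothetical and are not Xapian's internal query
 representation; treating only AND and the left-hand side of AND_NOT and
 AND_MAYBE as and-like is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified operator set for a toy query tree.
enum class Op { And, Or, AndNot, AndMaybe, Phrase, Near, Term };

struct Node {
    Op op;
    std::vector<Node> kids;
};

// Collect PHRASE/NEAR nodes reachable from n through and-like edges only.
// If any collected node fails to match, the query rooted at n can't match.
void collect_and_like(const Node& n, std::vector<const Node*>& out) {
    switch (n.op) {
        case Op::Phrase:
        case Op::Near:
            out.push_back(&n);
            break;
        case Op::And:
            // Every child of AND must match.
            for (const Node& k : n.kids) collect_and_like(k, out);
            break;
        case Op::AndNot:
        case Op::AndMaybe:
            // Only the left-hand side is required to match.
            assert(!n.kids.empty());
            collect_and_like(n.kids.front(), out);
            break;
        case Op::Or:
        case Op::Term:
            // A child of OR may fail without failing the query.
            break;
    }
}

// AND(PHRASE, OR(PHRASE, TERM), AND_NOT(NEAR, PHRASE))
Node example_query() {
    return Node{Op::And, {
        Node{Op::Phrase, {}},
        Node{Op::Or, {Node{Op::Phrase, {}}, Node{Op::Term, {}}}},
        Node{Op::AndNot, {Node{Op::Near, {}}, Node{Op::Phrase, {}}}},
    }};
}
```

 In the example query only the first PHRASE and the NEAR are collected; the
 PHRASE under OR and the one on the right of AND_NOT are not and-like with
 the root.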

 The patch needs cleaning up in a few places, and the pond size should
 default to something sane (based on DB size perhaps?) but I'm putting it
 here to make sure it doesn't get lost or forgotten.

 Patch is against trunk r13285.

-- 
Ticket URL: <http://trac.xapian.org/ticket/394>
Xapian <http://xapian.org/>