[Xapian-tickets] [Xapian] #394: Speed up phrase queries with a "settling pond"
Xapian
nobody at xapian.org
Sun Aug 9 16:02:55 BST 2009
#394: Speed up phrase queries with a "settling pond"
-------------------------+--------------------------------------------------
Reporter: olly | Owner: olly
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: Matcher | Version: 1.1.2
Severity: normal | Blockedby:
Platform: All | Blocking:
-------------------------+--------------------------------------------------
The attached patch implements a "settling pond" which delays the checking of
exact phrases that are "and-like with the root" (by which I mean that if the
phrase doesn't match, the whole query doesn't match). We discard pond
entries whose weight is below the current min_weight. When the pond fills
up, or the postlist tree is done, we take the highest weighted entries
from the pond and check those first, which makes it more likely we'll
increase min_weight and so be able to discard lower weighted pond entries
without needing to do the potentially expensive phrase check.
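
To make the idea concrete, here is a minimal Python sketch of a settling pond. It is not the attached patch (which is C++ inside the matcher) and only illustrates the description above. The names settle_and_match, phrase_check, top_k and pond_size are hypothetical, and checking exactly one entry each time the pond fills is an assumption made for simplicity.

```python
import heapq


def settle_and_match(candidates, phrase_check, top_k, pond_size):
    """Illustrative settling pond (hypothetical helper, not Xapian's API).

    candidates   -- iterable of (docid, weight) pairs from the postlist tree
    phrase_check -- callable(docid) -> bool, the expensive positional check
    top_k        -- number of results wanted (assumed >= 1)
    pond_size    -- pond capacity (assumed >= 1)

    Returns (results, checks): results is a list of (weight, docid) pairs,
    best first, and checks is how many phrase checks were performed.
    """
    results = []  # min-heap of (weight, docid), at most top_k entries
    pond = []     # max-heap via negated weight: (-weight, docid)
    checks = 0

    def min_weight():
        # Until top_k results are held, any weight can still get in.
        return results[0][0] if len(results) == top_k else float("-inf")

    def check_best():
        nonlocal checks
        neg_weight, docid = heapq.heappop(pond)
        weight = -neg_weight
        if weight <= min_weight():
            # The best pond entry can't make the results, so none can.
            pond.clear()
            return
        checks += 1
        if not phrase_check(docid):
            return
        if len(results) == top_k:
            heapq.heapreplace(results, (weight, docid))
        else:
            heapq.heappush(results, (weight, docid))
        # min_weight may have risen: drop entries which can no longer
        # make the results, without ever running their phrase check.
        floor = min_weight()
        pond[:] = [entry for entry in pond if -entry[0] > floor]
        heapq.heapify(pond)

    for docid, weight in candidates:
        if weight <= min_weight():
            continue  # discarded without a phrase check
        heapq.heappush(pond, (-weight, docid))
        if len(pond) >= pond_size:
            check_best()

    # The postlist tree is done: drain the pond, highest weight first.
    while pond:
        check_best()

    return sorted(results, reverse=True), checks
```

With top_k=1 and a pond of 3, feeding in six candidates whose best weight arrives second needs only one phrase check in this sketch, where checking each candidate as it arrives would need more.
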
This patch can dramatically improve query speeds for exact phrase queries
on common terms when the positional data isn't cached.
This could be extended to any OP_PHRASE or OP_NEAR check which is
"and-like with the root", and also to perform more than one such check.
The patch needs cleaning up in a few places, and the pond size should
default to something sane (based on DB size perhaps?) but I'm putting it
here to make sure it doesn't get lost or forgotten.
Patch is against trunk r13285.
--
Ticket URL: <http://trac.xapian.org/ticket/394>
Xapian <http://xapian.org/>