[Xapian-tickets] [Xapian] #378: Optimise MultiAndPostList using current weight of lhs.
Xapian
nobody at xapian.org
Sat Jun 13 13:23:42 BST 2009
#378: Optimise MultiAndPostList using current weight of lhs.
-------------------------+--------------------------------------------------
Reporter: richard | Owner: richard
Type: enhancement | Status: assigned
Priority: normal | Milestone: 1.1.2
Component: Matcher | Version: SVN trunk
Severity: normal | Keywords:
Blockedby: | Platform: All
Blocking: |
-------------------------+--------------------------------------------------
Comment(by olly):
I've been trying to think of a "worst case" for this idea so we can try it
and measure it to see how it does. If it's not measurably worse than the
current approach, we can stop worrying about it!
I'm not sure if it is the worst, but I think a bad case is when we have
"a" matching 2,4,6,8,10,...,1000000 and "b" matching
1,3,5,7,9,...,1000001. Then a AND b has a on the lhs, and currently we
try check() on b for each position in a, but never find a match. With
your patch we calculate the weight for each position in a as well, so the
overhead is 500000 weight calculations. I suspect that's measurable
compared to the 500000 calls to next() on a and check() on b.
Does this seem a plausible testcase? Can you think of a worse case to
try?
(Not always being a win doesn't necessarily sink the idea of course, but
it means that it needs more careful consideration, and possibly a
heuristic for when to apply this optimisation and when not to - for
example, we could have a "cost" for each postlist, set higher for
!ExternalPostList, etc).
--
Ticket URL: <http://trac.xapian.org/ticket/378#comment:2>
Xapian <http://xapian.org/>
Xapian
More information about the Xapian-tickets
mailing list