[Xapian-tickets] [Xapian] #378: Optimise MultiAndPostList using current weight of lhs.

Xapian nobody at xapian.org
Sat Jun 13 13:23:42 BST 2009


#378: Optimise MultiAndPostList using current weight of lhs.
-------------------------+--------------------------------------------------
 Reporter:  richard      |       Owner:  richard  
     Type:  enhancement  |      Status:  assigned 
 Priority:  normal       |   Milestone:  1.1.2    
Component:  Matcher      |     Version:  SVN trunk
 Severity:  normal       |    Keywords:           
Blockedby:               |    Platform:  All      
 Blocking:               |  
-------------------------+--------------------------------------------------

Comment(by olly):

 I've been trying to think of a "worst case" for this idea so we can try it
 and measure it to see how it does.  If it's not measurably worse than the
 current approach, we can stop worrying about it!

 I'm not sure if it is the worst, but I think a bad case is when we have
 "a" matching 2,4,6,8,10,...,1000000 and "b" matching
 1,3,5,7,9,...,1000001.  Then a AND b has a on the lhs, and currently we
 try check() on b for each position in a, but never find a match.  With
 your patch we calculate the weight for each position in a as well, so the
 overhead is 500000 weight calculations.  I suspect that's measurable
 compared to the 500000 calls to next() on a and check() on b.

 Does this seem a plausible testcase?  Can you think of a worse case to
 try?

 (Not always being a win doesn't necessarily sink the idea of course, but
 it means that it needs more careful consideration, and possibly a
 heuristic for when to apply this optimisation and when not to - for
 example, we could have a "cost" for each postlist, set higher for
 !ExternalPostList, etc).

-- 
Ticket URL: <http://trac.xapian.org/ticket/378#comment:2>
Xapian <http://xapian.org/>
Xapian



More information about the Xapian-tickets mailing list