[Xapian-tickets] [Xapian] #363: Avoid using the termlist when calculating percentages in get_mset()

Xapian nobody at xapian.org
Wed May 6 02:52:34 BST 2009


#363: Avoid using the termlist when calculating percentages in get_mset()
---------------------+------------------------------------------------------
 Reporter:  richard  |       Owner:  richard  
     Type:  defect   |      Status:  assigned 
 Priority:  normal   |   Milestone:  1.1.1    
Component:  Matcher  |     Version:  SVN trunk
 Severity:  normal   |    Keywords:           
Blockedby:           |    Platform:  All      
 Blocking:  181      |  
---------------------+------------------------------------------------------

Comment(by olly):

 Feel free to try it, but I think that's going to be measurably slower.

 On the plus side, note that this can potentially be optimised for AND
 queries with all leaf children - e.g. a MultiAndPostList of terms knows for
 certain that all its sub-postlists match, so there's no need to recurse
 into them - we can just store a precomputed mask to | in.
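
 A rough sketch of the mask idea, to make the shortcut concrete.  All the
 names here ({{{Node}}}, {{{Leaf}}}, {{{OrNode}}}, {{{AndOfLeaves}}},
 {{{gather}}}) are hypothetical stand-ins, not Xapian's actual PostList
 API, and the sketch assumes at most 64 query terms so that a single integer
 can serve as the mask:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: at most 64 query terms, one bit per term.
typedef unsigned long long TermMask;

// Hypothetical stand-in for a node in the postlist tree.
struct Node {
    virtual ~Node() {}
    // True if this subtree matches the document being considered.
    virtual bool matches() const = 0;
    // OR the bits of the matching query terms into mask.
    // Precondition: matches() is true.
    virtual void gather(TermMask& mask) const = 0;
};

struct Leaf : Node {
    unsigned term_index;
    bool present;  // stands in for "postlist is positioned on this document"
    Leaf(unsigned idx, bool p) : term_index(idx), present(p) {
        assert(idx < 64);
    }
    bool matches() const { return present; }
    void gather(TermMask& mask) const {
        if (present) mask |= TermMask(1) << term_index;
    }
};

struct OrNode : Node {
    std::vector<const Node*> kids;
    bool matches() const {
        for (std::size_t i = 0; i < kids.size(); ++i)
            if (kids[i]->matches()) return true;
        return false;
    }
    // An OR has to ask each child, since any subset of them may match.
    void gather(TermMask& mask) const {
        for (std::size_t i = 0; i < kids.size(); ++i)
            if (kids[i]->matches()) kids[i]->gather(mask);
    }
};

struct AndOfLeaves : Node {
    std::vector<const Leaf*> kids;
    TermMask all_terms;  // built once, as children are added
    AndOfLeaves() : all_terms(0) {}
    void add(const Leaf* leaf) {
        kids.push_back(leaf);
        all_terms |= TermMask(1) << leaf->term_index;
    }
    bool matches() const {
        for (std::size_t i = 0; i < kids.size(); ++i)
            if (!kids[i]->matches()) return false;
        return true;
    }
    // If the AND matches then every child matches, so no recursion is
    // needed: just OR in the stored mask.
    void gather(TermMask& mask) const { mask |= all_terms; }
};
```

 For a query like (t0 AND t1) OR t2 where t2 doesn't match, the OR node
 makes one {{{gather}}} call on the AND node and the leaves under the AND
 are never visited.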

 I think we would probably want a specialised class rather than
 {{{vector<bool>}}} as we don't need dynamic resizing (just a runtime-
 specified fixed size, which rules out {{{bitset<>}}} which has a compile-
 time constant fixed size) but we do want the ability to efficiently do:
 {{{v1 |= v2;}}}

 But {{{vector<bool>}}} should be fine to prototype what sort of overhead
 there is here.
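
 For illustration, a minimal sketch of what such a specialised class might
 look like.  {{{FixedBitSet}}} and {{{WORD_BITS}}} are made-up names, not
 anything in Xapian; the size is fixed at construction and {{{|=}}} works a
 word at a time:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Number of bits in one storage word.
const std::size_t WORD_BITS = sizeof(unsigned long) * CHAR_BIT;

// Hypothetical class: a bit set whose size is chosen at runtime when it is
// constructed and never changes afterwards.
class FixedBitSet {
    std::size_t n_bits;
    std::vector<unsigned long> words;

  public:
    explicit FixedBitSet(std::size_t n)
        : n_bits(n), words((n + WORD_BITS - 1) / WORD_BITS, 0UL) {}

    std::size_t size() const { return n_bits; }

    void set(std::size_t i) {
        assert(i < n_bits);
        words[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
    }

    bool test(std::size_t i) const {
        assert(i < n_bits);
        return (words[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL;
    }

    // The operation we care about: OR a whole word per loop iteration
    // rather than a bit at a time.
    FixedBitSet& operator|=(const FixedBitSet& o) {
        assert(n_bits == o.n_bits);
        for (std::size_t w = 0; w < words.size(); ++w)
            words[w] |= o.words[w];
        return *this;
    }
};
```

 Usage is just {{{FixedBitSet v1(n), v2(n); ... v1 |= v2;}}} with {{{n}}}
 being the number of query terms.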

 Perhaps better, be lazy about this and only do it (via a new method) for a
 match which is the current best.  This has a bad worst case though - if the
 weights increase with docid it's even more work than the non-lazy approach,
 as there are a lot of extra virtual method calls.
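
 To see the worst case, here's a toy model which just counts how often the
 lazy approach would need to gather matching terms, given the weights in
 docid order.  {{{count_lazy_gathers}}} is a hypothetical helper for
 illustration only and models nothing else about the matcher:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: given candidate weights in docid order, count how
// many times we'd gather matching terms if we only do so for a candidate
// which becomes the new best match.
std::size_t count_lazy_gathers(const std::vector<double>& weights) {
    std::size_t gathers = 0;
    bool have_best = false;
    double best = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        if (!have_best || weights[i] > best) {
            best = weights[i];
            have_best = true;
            ++gathers;  // a new best: this is where the new method runs
        }
    }
    // Sanity check: we can't gather more often than there are candidates.
    assert(gathers <= weights.size());
    return gathers;
}
```

 With strictly increasing weights every candidate is a new best, so the
 count equals the number of candidates; with decreasing weights it is 1.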

 This could be optional on a "want percentages" flag, but it would be nice
 to use this for "matching terms" too, which would require calculating and
 keeping these for anything in the proto-MSet...

 I don't think we want to rush into this, but we clearly can't ship 1.1.1
 with the testsuite failing assertions.  So I think it's better to find a
 quick fix for the synonym4 assertion failure, perhaps along the lines of
 the (non-working) patch I tried.

-- 
Ticket URL: <http://trac.xapian.org/ticket/363#comment:3>
Xapian <http://xapian.org/>