[Xapian-tickets] [Xapian] #363: Avoid using the termlist when calculating percentages in get_mset()

Xapian nobody at xapian.org
Tue May 5 15:43:41 BST 2009


#363: Avoid using the termlist when calculating percentages in get_mset()
---------------------+------------------------------------------------------
 Reporter:  richard  |       Owner:  richard  
     Type:  defect   |      Status:  assigned 
 Priority:  normal   |   Milestone:  1.1.1    
Component:  Matcher  |     Version:  SVN trunk
 Severity:  normal   |    Keywords:           
Blockedby:           |    Platform:  All      
 Blocking:  181      |  
---------------------+------------------------------------------------------

Comment(by richard):

 One method for fixing this:

  * Keep a vector of all the weight objects generated when building the
 query.
  * Tell each leaf postlist (or weight object) the index of its weight
 object in this list.
  * Introduce a vector<bool> which is used to keep track of which weight
 objects have been used.
  * Clear this vector before calling get_weight() on the top postlist from
 get_mset().
  * Whenever a leaf postlist's get_weight() method is called, set the
 appropriate bit.
  * If the result is a new top-doc in get_mset(), copy its vector<bool> to
 a safe place.
  * Use the saved vector, rather than the termlist, to calculate the
 percentage weight.
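
 The steps above could be sketched roughly as follows. This is only an
 illustration, not Xapian's matcher code: Leaf, UsedWeightTracker,
 matched_fraction() and demo_fraction() are hypothetical names, and the
 final step assumes the percentage is scaled by the share of the total
 maximum weight that belongs to the leaves which matched the top document.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a leaf postlist plus its weight object.
struct Leaf {
    std::size_t weight_index;  // index of its weight object in the shared list
    double weight;             // weight it reports for the current document
    double max_weight;         // upper bound on that weight
};

// Hypothetical helper holding the "used" bits described in the list above.
class UsedWeightTracker {
  public:
    explicit UsedWeightTracker(std::size_t n_weights)
        : used(n_weights, false), top_used(n_weights, false) {}

    // Clear the bits before weighing the next candidate document.
    void begin_candidate() { used.assign(used.size(), false); }

    // Stands in for a leaf's get_weight(): set the bit, return the weight.
    double get_weight(const Leaf& leaf) {
        assert(leaf.weight_index < used.size());
        used[leaf.weight_index] = true;
        return leaf.weight;
    }

    // If this candidate is the new top document, copy its bits somewhere safe.
    void end_candidate(double total_weight) {
        if (!have_top || total_weight > top_weight) {
            have_top = true;
            top_weight = total_weight;
            top_used = used;
        }
    }

    const std::vector<bool>& top_bits() const { return top_used; }

  private:
    std::vector<bool> used;      // bits for the candidate being weighed
    std::vector<bool> top_used;  // bits saved for the best document so far
    double top_weight = 0.0;
    bool have_top = false;
};

// Assumed scaling: max weight of the matched leaves over max weight of all.
double matched_fraction(const std::vector<Leaf>& leaves,
                        const std::vector<bool>& bits) {
    double matched = 0.0, total = 0.0;
    for (const Leaf& leaf : leaves) {
        total += leaf.max_weight;
        if (bits[leaf.weight_index]) matched += leaf.max_weight;
    }
    return total > 0.0 ? matched / total : 0.0;
}

// Two candidate documents over a three-leaf query.
double demo_fraction() {
    std::vector<Leaf> leaves = {{0, 1.5, 2.0}, {1, 0.5, 1.0}, {2, 0.75, 1.0}};
    UsedWeightTracker tracker(leaves.size());

    tracker.begin_candidate();  // document A matches leaves 0 and 1
    double a = tracker.get_weight(leaves[0]) + tracker.get_weight(leaves[1]);
    tracker.end_candidate(a);

    tracker.begin_candidate();  // document B matches leaf 2 only
    double b = tracker.get_weight(leaves[2]);
    tracker.end_candidate(b);

    return matched_fraction(leaves, tracker.top_bits());
}
```

 In this sketch the per-candidate cost is one assign() over the vector
 plus one bit write per matching leaf; the copy happens only when a new
 top document is found.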

 Risk: clearing the vector and setting its bits for every candidate match
 might cost too much CPU time.

-- 
Ticket URL: <http://trac.xapian.org/ticket/363#comment:2>
Xapian <http://xapian.org/>