[Xapian-tickets] [Xapian] #344: Allow calculation of percentages to be disabled

Xapian nobody at xapian.org
Fri Mar 13 17:02:37 GMT 2009


#344: Allow calculation of percentages to be disabled
---------------------+------------------------------------------------------
 Reporter:  richard  |       Owner:  olly     
     Type:  defect   |      Status:  new      
 Priority:  normal   |   Milestone:  1.1.1    
Component:  Other    |     Version:  SVN trunk
 Severity:  normal   |   Blockedby:           
 Platform:  All      |    Blocking:           
---------------------+------------------------------------------------------
 Currently, every call to get_mset() calculates a percentage weight for each
 document.  This has a measurable overhead, and percentages are often not
 needed.  It would therefore be useful to be able to disable the calculation
 of percentages for a match (and possibly even to have the calculation
 disabled by default).  Alternatively, if we can reduce the overhead to a
 very small amount (e.g., less than 1%), it would probably be reasonable to
 continue calculating percentages in all cases, for the added convenience of
 not needing to enable them before searches.
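
 To make the proposal concrete, here is a minimal sketch of what an opt-in
 percentage calculation could look like.  It is not Xapian's API: the names
 Hit, rank() and want_percent are hypothetical, and the percentage formula
 (weight as a fraction of a supplied maximum weight) is a deliberate
 simplification of what the matcher actually does.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Hit:
    """One ranked result (hypothetical type, not part of Xapian)."""
    docid: int
    weight: float
    # Stays None unless the caller asked for percentages.
    percent: Optional[int] = None


def rank(weights: Dict[int, float],
         max_possible_weight: float,
         want_percent: bool = False) -> List[Hit]:
    """Rank documents by weight; compute percentages only on request.

    Assumption: the percentage is simply weight / max_possible_weight.
    The real calculation is more involved, which is where the overhead
    discussed in this ticket comes from.
    """
    # Highest weight first; ties broken by ascending docid.
    ordered = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    hits = [Hit(docid, weight) for docid, weight in ordered]

    # The optional step: skipped entirely when percentages are not wanted.
    if want_percent and hits and max_possible_weight > 0:
        for hit in hits:
            hit.percent = int(100 * hit.weight / max_possible_weight)
    return hits
```

 With want_percent left at its default, the percentage step is never
 entered, which is the behaviour this ticket asks to make available.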

 I've just been examining the performance of a set of 10-term OR searches.
 According to kcachegrind, around 5.5% of the CPU time is spent at the end
 of get_mset() reading the termlist of the top document.  This is done
 only to check which terms are present in the top document, in order to
 calculate the percentage for that document.  Therefore, the current
 overhead for these searches is at least 5.5% of the search time (when
 we're not IO bound).

 There is a patch attached to ticket #216
 (http://trac.xapian.org/attachment/ticket/216/calcpercent.patch) which
 adds this feature (though it may need updating to match SVN trunk).
 However, it seems to have been ignored or forgotten, so I think it deserves
 a ticket of its own for discussion.

-- 
Ticket URL: <http://trac.xapian.org/ticket/344>
Xapian <http://xapian.org/>