[Xapian-tickets] [Xapian] #360: SynonymPostList always requires doclength if wdf is used

Xapian nobody at xapian.org
Fri Apr 24 01:58:21 BST 2009


#360: SynonymPostList always requires doclength if wdf is used
---------------------+------------------------------------------------------
 Reporter:  richard  |       Owner:  olly     
     Type:  defect   |      Status:  new      
 Priority:  normal   |   Milestone:  1.1.7    
Component:  Matcher  |     Version:  SVN trunk
 Severity:  minor    |   Blockedby:           
 Platform:  All      |    Blocking:           
---------------------+------------------------------------------------------
 SynonymPostList (in the opsynonym branch) currently clamps computed wdf
 values to the document length.  This ensures that the wdf never exceeds
 the document length, a condition that some weighting schemes can rely on
 when computing tight bounds on the maximum weight.
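
 As a rough illustration of the clamping described above, here is a
 minimal sketch.  It is not the code from the opsynonym branch: the
 termcount alias and the clamped_synonym_wdf() helper are hypothetical
 names, and it assumes the synonym wdf is formed by summing the wdf of
 each subquery.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a term-count type; not taken from Xapian.
using termcount = std::uint32_t;

// Sum the wdf contributed by each synonym subquery for one document, then
// clamp the total to the document length.  Computing the clamp is what
// forces the doclength to be fetched.
termcount clamped_synonym_wdf(const std::vector<termcount>& sub_wdfs,
                              termcount doclength) {
    std::uint64_t sum = 0;  // wider type so the running sum cannot wrap
    for (termcount w : sub_wdfs) sum += w;
    termcount result =
        static_cast<termcount>(std::min<std::uint64_t>(sum, doclength));
    // The invariant some weighting schemes rely on: wdf <= doclength.
    assert(result <= doclength);
    return result;
}
```

 For example, subquery wdfs of 3 and 4 in a document of length 5 would
 give a synonym wdf of 5 rather than 7.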

 It would be good to avoid calculating the doclength for weighting schemes
 which require the wdf but not the doclength.  One approach would be to
 ensure that the wdf sum used in op synonym only counts each physical term
 once.  However, it is hard to do this duplicate removal in advance,
 because query tree decay may remove some instances of a term while
 leaving others.
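
 The idea of counting each physical term once can be sketched as below.
 This is only an illustration under assumptions: LeafMatch and
 deduplicated_wdf() are hypothetical names, term identity is taken to be
 the term string, and duplicates are removed at match time over whichever
 leaves are still live.  That is one possible way around the decay
 problem, not something this ticket proposes.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a term-count type; not taken from Xapian.
using termcount = std::uint32_t;

// Hypothetical record: one entry per live leaf of the synonym subtree
// that matches the current document.
struct LeafMatch {
    std::string term;  // the physical term the leaf reads
    termcount wdf;     // that term's wdf in the current document
};

// Sum the wdf over the leaves, counting each distinct term only once.
termcount deduplicated_wdf(const std::vector<LeafMatch>& leaves) {
    std::unordered_set<std::string> seen;
    termcount sum = 0;
    for (const LeafMatch& leaf : leaves) {
        // insert().second is true only the first time a term is seen.
        if (seen.insert(leaf.term).second) sum += leaf.wdf;
    }
    return sum;
}
```

 If the wdfs of distinct terms in a document never sum to more than its
 length, a sum built this way needs no clamp, so the doclength would not
 have to be fetched.  The cost is a set lookup per leaf per document.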

-- 
Ticket URL: <http://trac.xapian.org/ticket/360>
Xapian <http://xapian.org/>