[Xapian-tickets] [Xapian] #225: Spelling algorithm should consider frequency and not just edit-distance

Xapian nobody at xapian.org
Tue Jul 7 17:59:29 BST 2009


#225: Spelling algorithm should consider frequency and not just edit-distance
-------------------------+--------------------------------------------------
 Reporter:  philipn      |        Owner:  olly     
     Type:  defect       |       Status:  assigned 
 Priority:  low          |    Milestone:  1.1.4    
Component:  Library API  |      Version:  SVN trunk
 Severity:  normal       |   Resolution:           
 Keywords:               |    Blockedby:           
 Platform:  All          |     Blocking:           
-------------------------+--------------------------------------------------
Changes (by olly):

  * priority:  normal => low


Comment:

 My main reservation here is that the algorithm seems rather arbitrary -
 it would be more satisfactory to have a mathematical model or some other
 justification for it.  I worry that if one term indexes most documents it
 might get suggested too often.  It also seems this might require
 significantly more work, but even if it does, the difference might not be
 measurable, and better suggestions are worth some extra work.

 Perhaps we should only consider suggestions which require at most one (or
 perhaps two) edits more than the closest match - still arbitrary, but at
 least the behaviour is more constrained.  And this would still allow us to
 cull a lot of entries which can't have a low enough edit distance.
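
 A minimal sketch of that idea, in Python for brevity rather than the
 library's C++.  It is not Xapian's implementation: the names suggest(),
 edit_distance(), max_distance and slack are hypothetical, the frequency
 table is made up, and plain Levenshtein distance (no transpositions) is
 assumed.  Candidates within `slack` edits of the closest match compete on
 frequency, and anything which can't fall inside that window is culled
 early.

```python
def edit_distance(a, b, limit):
    """Levenshtein distance between a and b, or limit + 1 if it exceeds limit."""
    if abs(len(a) - len(b)) > limit:
        return limit + 1  # length difference alone rules this pair out
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        if min(cur) > limit:
            return limit + 1  # no path can get back under the limit
        prev = cur
    return min(prev[-1], limit + 1)


def suggest(word, freqs, max_distance=2, slack=1):
    """Return the most frequent term within `slack` edits of the closest match."""
    candidates = []
    best = max_distance + 1  # smallest distance seen so far
    for term, freq in freqs.items():
        if term == word:
            continue
        # Cull: a term further away than best + slack can never be chosen.
        bound = min(max_distance, best + slack)
        d = edit_distance(word, term, bound)
        if d > bound:
            continue
        best = min(best, d)
        candidates.append((d, freq, term))
    window = [c for c in candidates if c[0] <= best + slack]
    if not window:
        return None
    # Highest frequency wins; ties go to the smaller edit distance.
    return max(window, key=lambda c: (c[1], -c[0]))[2]


# Hypothetical term frequencies, purely for illustration.
freqs = {"the": 1000, "then": 50, "tee": 3, "thee": 2}
by_distance_only = suggest("teh", freqs, slack=0)
with_one_extra_edit = suggest("teh", freqs, slack=1)
```

 With slack=0 only the closest terms are considered, so the rare "tee"
 (one edit away) is chosen.  With slack=1 the much more frequent "the"
 (two edits under this distance measure) wins, but a very common term
 more than one edit beyond the closest match still can't be suggested.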

 Setting priority low for now.

-- 
Ticket URL: <http://trac.xapian.org/ticket/225#comment:6>
Xapian <http://xapian.org/>