[Xapian-tickets] [Xapian] #225: Spelling algorithm should consider frequency and not just edit-distance
Xapian
nobody at xapian.org
Tue Jul 7 17:59:29 BST 2009
#225: Spelling algorithm should consider frequency and not just edit-distance
-------------------------+--------------------------------------------------
Reporter: philipn | Owner: olly
Type: defect | Status: assigned
Priority: low | Milestone: 1.1.4
Component: Library API | Version: SVN trunk
Severity: normal | Resolution:
Keywords: | Blockedby:
Platform: All | Blocking:
-------------------------+--------------------------------------------------
Changes (by olly):
* priority: normal => low
Comment:
My main reservation here is that the algorithm seems rather arbitrary -
it would be more satisfactory to have a mathematical model or other
justification. I worry that if one term indexes most documents, it might
get suggested too often. It also seems this might require significantly
more work, though even if it does, that might not make a measurable
difference, and better suggestions are worth some extra work.
Perhaps we should only consider suggestions which require one (or perhaps
two) more edits than the closest match - still arbitrary, but at least the
behaviour is more constrained. And this would still allow us to cull a lot
of entries which can't have a low enough edit distance.
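
As a rough illustration of that idea (a sketch only, not Xapian's actual
spelling code): find the smallest edit distance among the candidates, keep
everything within a small "slack" of it, and pick the most frequent term
from that window. The names suggest, term_freqs, max_distance and slack are
hypothetical, and plain Levenshtein distance is assumed here.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # delete ca
                           cur[j - 1] + 1,               # insert cb
                           prev[j - 1] + (ca != cb)))    # substitute or match
        prev = cur
    return prev[-1]


def suggest(word, term_freqs, max_distance=2, slack=1):
    """Return the most frequent term within `slack` edits of the closest match.

    term_freqs is a hypothetical mapping of term -> frequency; it stands in
    for whatever frequency data the spelling dictionary really stores.
    """
    scored = []
    for term, freq in term_freqs.items():
        # Cheap cull: the length difference is a lower bound on edit distance.
        if abs(len(term) - len(word)) > max_distance:
            continue
        d = edit_distance(word, term)
        if d <= max_distance:
            scored.append((d, freq, term))
    if not scored:
        return None
    best = min(d for d, _, _ in scored)
    # Only candidates at most `slack` edits worse than the best are eligible.
    window = [(freq, -d, term) for d, freq, term in scored if d <= best + slack]
    # Highest frequency wins; ties go to the smaller edit distance.
    return max(window)[2]
```

With slack=0 this reduces to picking the most frequent of the closest
matches; with slack=1 a much more common term one edit further away can
win. Because candidates are still capped at max_distance, the length-based
cull keeps working.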
Setting priority low for now.
--
Ticket URL: <http://trac.xapian.org/ticket/225#comment:6>
Xapian <http://xapian.org/>
Xapian