[Xapian-tickets] [Xapian] #225: Spelling algorithm should consider frequency and not just edit-distance
Xapian
nobody at xapian.org
Tue Dec 5 05:12:30 GMT 2023
#225: Spelling algorithm should consider frequency and not just edit-distance
-----------------------------+-------------------------------
 Reporter:  Philip Neustrom  |             Owner:  Olly Betts
     Type:  defect           |            Status:  assigned
 Priority:  normal           |         Milestone:  2.0.0
Component:  Library API      |           Version:  git master
 Severity:  normal           |        Resolution:
 Keywords:                   |        Blocked By:
 Blocking:                   |  Operating System:  All
-----------------------------+-------------------------------
Changes (by Olly Betts):
* milestone: 1.5.0 => 2.0.0
Old description:
> As described here:
> http://thread.gmane.org/gmane.comp.search.xapian.general/5740/focus=5743
>
> If the spelling correction algorithm considered frequency and
> edit-distance (using some reasonable heuristic) we would see
> dramatically better results.
> ~~The current spelling algorithm will only correct words that never
> appear in the spelling index.~~ ''(Since 1.2.3, it will offer a
> correction for a word when the correction has a higher frequency than
> the word)''
New description:
As described here:
http://thread.gmane.org/gmane.comp.search.xapian.general/5740/focus=5743

If the spelling correction algorithm considered frequency and
edit-distance (using some reasonable heuristic) we would see
~~dramatically~~ better results.
~~The current spelling algorithm will only correct words that never
appear in the spelling index.~~ ''(Since 1.2.3, it will offer a
correction for a word when the correction has a higher frequency than
the word)''
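
As a rough illustration of what "considering frequency and edit-distance"
could mean, here is a minimal Python sketch. It is not Xapian's
implementation and does not use the Xapian API: the function names, the
scoring formula (log frequency minus a per-edit penalty), the default
parameters and the example frequencies are all assumptions made up for this
example. It only keeps the one rule stated above, that a correction must be
more frequent than the word itself.

```python
from math import log


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute), two-row DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def suggest(word, freqs, max_distance=2, distance_penalty=3.0):
    """Return the best-scoring correction for `word`, or None.

    `freqs` maps each known word to its frequency. The score
    log(1 + freq) - distance_penalty * distance is a hypothetical
    heuristic chosen for illustration, not a tuned or official one.
    """
    own_freq = freqs.get(word, 0)
    best, best_score = None, float("-inf")
    for candidate, freq in freqs.items():
        # Only offer corrections that are more frequent than the word itself.
        if candidate == word or freq <= own_freq:
            continue
        distance = edit_distance(word, candidate)
        if distance > max_distance:
            continue
        score = log(1 + freq) - distance_penalty * distance
        if score > best_score:
            best, best_score = candidate, score
    return best


# Made-up frequencies, purely for demonstration.
EXAMPLE_FREQS = {"hello": 900, "hallo": 4, "help": 300}
```

With these example frequencies, `suggest("helo", EXAMPLE_FREQS)` picks
"hello" over "help": both are one edit away, so frequency breaks the tie.
`suggest("hallo", EXAMPLE_FREQS)` also returns "hello" even though "hallo"
is itself a known word, and `suggest("hello", EXAMPLE_FREQS)` returns None
because no candidate is more frequent.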
--
Comment:
It'd be really good to address the remainder of this, but it no longer
requires a database format change, so postponing.
--
Ticket URL: <https://trac.xapian.org/ticket/225#comment:17>
Xapian <https://xapian.org/>