[Xapian-discuss] Re: Spelling based on frequency and not just distance
Philip Neustrom
philipn at gmail.com
Tue Jan 15 12:57:18 GMT 2008
The patch attached to this email is better than the previous one. Hopefully
somebody can come up with something better entirely, as I'm not totally
happy with what I have -- it tends to chain suggestions, offering "plant"
for "plants" and then "plan" for "plant" :)
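
To make the behaviour concrete, here is a rough Python sketch of the idea.
It is not the attached patch and not Xapian's code: the function names, the
frequency table, the plain Levenshtein distance and the "frequency first,
then closeness" ranking are all assumptions made for illustration. A word
that is already in the index still gets a suggestion when a nearby term is
more frequent, which fixes "wikipeda" but also produces the "plants" ->
"plant" -> "plan" chain.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def suggest(word, freqs, max_distance=1):
    """Return the best more-frequent term within max_distance, or None.

    freqs is a hypothetical {term: frequency} table standing in for the
    spelling dictionary. An exact match does not stop the search; it only
    sets the frequency that a candidate has to beat.
    """
    own_freq = freqs.get(word, 0)
    best, best_key = None, None
    for term, freq in freqs.items():
        if term == word or freq <= own_freq:
            continue
        distance = edit_distance(word, term)
        if distance > max_distance:
            continue
        # Assumed ranking: higher frequency wins, closer term breaks ties.
        key = (freq, -distance)
        if best_key is None or key > best_key:
            best, best_key = term, key
    return best


# Made-up frequencies, for illustration only.
freqs = {"wikipedia": 500, "wikipeda": 2, "plan": 90, "plant": 40, "plants": 25}

print(suggest("wikipeda", freqs))  # the misspelling is indexed, but rarer
print(suggest("plants", freqs))    # a correctly spelled word gets "corrected"
print(suggest("plant", freqs))     # and the suggestion chains one step further
```

Requiring the candidate to be more frequent by some factor, rather than by
any margin, is one possible way to damp the chaining, but picking that
factor is exactly the weighting question I don't have a good answer to.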
--Philip
On Jan 15, 2008 1:24 AM, Philip Neustrom <philipn at gmail.com> wrote:
> Hey all,
>
> After implementing the new spelling functionality on http://wikispot.org I
> noticed that terms like "wikipeda" weren't yielding spelling suggestions.
> Taking a quick look at the code, it looks like if we find an exact match,
> even if it has a frequency less than another match within the provided
> delta, we don't suggest anything. This is probably fine for sites with
> documents where you can be assured the data is properly spelled -- but not
> suitable for something like a wiki or the web in general.
>
> I did something simple, attached in a patch. Maybe someone has a better
> idea of how to weigh the different options, but my quick fix seemed to give
> much better results than the "give up on exact or edit-distance-closest
> match" code that was there already.
>
> --Philip Neustrom
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spelling_frequency.diff
Type: text/x-diff
Size: 2638 bytes
Desc: not available
Url : http://lists.tartarus.org/pipermail/xapian-discuss/attachments/20080115/a09ce691/spelling_frequency.bin