[Xapian-devel] [GSOC 2011] Participate

Olly Betts olly at survex.com
Thu Mar 24 11:20:57 GMT 2011


We decided to use the xapian-devel list for GSoC discussions, since GSoC
generates a lot of extra list traffic and we don't want to fill up the
mailboxes of everyone on xapian-discuss.  We've let them know that they
can subscribe to xapian-devel if they want to follow GSoC, but this way
they can easily choose not to.

So I'm replying to xapian-devel - please keep replies there too.

On Tue, Mar 22, 2011 at 08:01:37PM -0400, Prasad Prabhu wrote:
> I am interested in participating in an open source project through Google
> Summer of Code 2011. I went through the idea list and I am keen on
> *improving spelling correction* as I have projects in algorithms and
> linguistics. Can anyone suggest some reading material?

You could have a look at the current code.  Here's where we find a
spelling suggestion for a given word:

http://trac.xapian.org/browser/trunk/xapian-core/api/omdatabase.cc#L535

This is the edit distance algorithm we currently use:

http://berghel.net/publications/asm/asm.php
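
For orientation, here is a minimal sketch of the textbook
dynamic-programming edit distance (insertions, deletions and
substitutions only).  It is not the algorithm from the page above and
not the code Xapian uses; it is just the simple baseline that faster
algorithms improve on.  The function name is made up for illustration.

```python
def edit_distance(a, b):
    """Textbook Levenshtein distance between strings a and b.

    Counts single-character insertions, deletions and substitutions.
    Illustrative only: this is the plain O(len(a) * len(b)) method.
    """
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # prev[j] is the distance between the first i-1 chars of a and
    # the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # match or substitute
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

A spelling corrector would typically compute this against each
candidate word and keep the closest one, which is why the cost of a
single distance computation matters.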

Here's an interesting algorithm someone (Dan I think) pointed out
recently, which might be useful:

http://blog.notdot.net/2010/07/Damn-Cool-Algorithms-Levenshtein-Automata
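
To give a rough idea of what that post is about, here is a small sketch
that simulates a Levenshtein NFA directly: it answers "is this
candidate within k edits of the word?" without filling in a full
distance table.  This is my own simplified illustration, not the code
from the post, and all the names in it are hypothetical.  The post goes
further by turning the automaton into a DFA and walking it against a
sorted word list, which this sketch does not attempt.

```python
def _closure(states, n, k):
    """Add states reached by skipping characters of the target word.

    A state (i, e) means: i characters of the word are accounted for
    and e edits have been spent.  Skipping a word character models a
    deletion and consumes no input.
    """
    result = set(states)
    stack = list(states)
    while stack:
        i, e = stack.pop()
        if i < n and e < k:
            nxt = (i + 1, e + 1)
            if nxt not in result:
                result.add(nxt)
                stack.append(nxt)
    return result


def _step(states, c, word, k):
    """Advance every NFA state over one input character c."""
    n = len(word)
    out = set()
    for i, e in states:
        if i < n and word[i] == c:
            out.add((i + 1, e))          # exact match, no edit spent
        if e < k:
            out.add((i, e + 1))          # c is an extra (inserted) char
            if i < n:
                out.add((i + 1, e + 1))  # c replaces word[i]
    return _closure(out, n, k)


def within_distance(word, candidate, k):
    """Return True if candidate is within k edits of word."""
    n = len(word)
    states = _closure({(0, 0)}, n, k)
    for c in candidate:
        states = _step(states, c, word, k)
        if not states:
            return False  # no way to recover within k edits
    return any(i == n for i, _ in states)


if __name__ == "__main__":
    for cand in ["food", "fod", "foxd", "fxxd"]:
        print(cand, within_distance("food", cand, 1))
```

The early return when the state set becomes empty is the part that
makes this attractive: a candidate can be rejected as soon as it strays
too far, rather than after a full comparison.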

Other than that, I'd suggest trying to find resources on the internet.

Cheers,
    Olly


