[Xapian-devel] Adding support for phonetic correction via soundex algorithms

James Aylett james-xapian at tartarus.org
Tue Dec 30 21:24:19 GMT 2014


On 30 Dec 2014, at 07:27, Shubham Sharma <shubham8742 at gmail.com> wrote:

> I'm Shubham Sharma, a 3rd year Information Systems student at BITS Pilani, and I'd like to explore the possibility of a GSoC project that would aim at improving the tolerant retrieval capabilities via phonetic correction (handling misspellings that arise because the user types a query that sounds like the target term). I did find spelling correction via edit distance in xapian-core/matcher, but I don't think any phonetic correction algorithms have been implemented. Could this be a possible GSoC project?

Shubham — hi, thanks for getting in touch!

For the spelling correction side you'll want to consider both implementing a phonetic similarity comparison (along with justification for your choice of algorithm) and a way of choosing between edit distance and phonetic approaches. You’ll also want to come up with a way of measuring how good each is, or at least of giving people a way to decide which approach is more suitable for the system they’re building using Xapian.
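
To make the comparison concrete, here is a minimal sketch in Python (not Xapian code, and not using any Xapian API). It assumes classic American Soundex as the phonetic algorithm, which is only one possible choice, and plain Levenshtein distance for the edit distance side. The names soundex, edit_distance, phonetic_match, edit_match and count_hits are all made up for illustration, and the word pairs in the usage note are toy examples rather than real measurements.

```python
# Illustrative only: classic American Soundex vs. Levenshtein edit distance.
# None of these names come from Xapian; they are hypothetical helpers.

# Map each consonant to its Soundex digit.
CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code of word ("" if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = CODES.get(ch, "")  # vowels (and anything unmapped) give ""
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
        if len(result) == 4:
            break
    return result.ljust(4, "0")


def edit_distance(a, b):
    """Levenshtein distance, keeping only one row of the table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def phonetic_match(query, target):
    """True if both words have the same Soundex code."""
    return soundex(query) == soundex(target)


def edit_match(query, target, max_edits=1):
    """True if target is within max_edits single-character edits of query."""
    return edit_distance(query, target) <= max_edits


def count_hits(pairs, matcher):
    """Count the (typed, intended) pairs that the given matcher accepts."""
    return sum(1 for typed, intended in pairs if matcher(typed, intended))
```

With a handful of hand-picked pairs such as ("rupert", "robert"), ("bat", "cat") and ("smyth", "smith"), calling count_hits(pairs, phonetic_match) and count_hits(pairs, edit_match) shows that the two approaches accept different pairs: Soundex treats "rupert" and "robert" as equivalent although they are two substitutions apart, while edit distance accepts "bat" for "cat" where Soundex does not, because Soundex always keeps the first letter. A real evaluation would need a proper corpus of misspellings rather than toy pairs like these.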

Unless it turns out to be more complex than I’d anticipate, that doesn’t sound like a project that would occupy you for the whole period of GSoC. It could however be combined with something else to make a good application.

J

-- 
 James Aylett, occasional trouble-maker
 xapian.org
