[Xapian-discuss] GSoC 2012

Liam xapian at networkimprov.net
Fri Feb 24 08:35:10 GMT 2012


On Wed, Feb 22, 2012 at 6:25 PM, Olly Betts <olly at survex.com> wrote:

>  And it's better if the ideas don't involve extensive changes to the
> same parts of the code as each other, as that means a painful merge
> if both projects get done.  So we probably don't want one student
> working on indexing container formats while another works on using
> extraction libraries.  Perhaps one student could work on a project
> covering both though.
>

The Text Extraction Libraries project does of course overlap with the
Mime2Text library I've drafted. It's not clear to me how much the former
would directly change omindex.cc. If the changes would be significant, I'd
suggest doing that project as a branch of Mime2Text rather than on a new
branch. I'm happy to support that.

https://github.com/networkimprov/xapian/commits/liam_mime2text-lib

(BTW, it'd be great if you could take a deeper look at this sometime. I
posted my draft back in November...)

As for the Node.js binding, I'd suggest you add this line to the Support
Another Language project:

A basic Node.js binding exists but lacks many Xapian features. Extending it
requires learning the V8 & Node plugin APIs.
https://github.com/networkimprov/node-xapian

Liam

