[Xapian-devel] Add an example to the community page and contribute more code

Olly Betts olly at survex.com
Thu Jan 24 23:07:24 GMT 2013


On Wed, Jan 23, 2013 at 10:45:42AM +0530, aarsh shah wrote:
> Hi Olly :) I guess you are busy these days.

We have visitors staying at the moment, so I'm afraid I'm not online as
much as I typically am.  It sounds like you're making good progress
unaided though!

> Please can you just let me know about the documentation standards
> and expectations that the community has. Want to document the stemmer code
> as nicely as I can :)

I'd recommend reading the advice in the "HACKING" document, which is in
the source tree in xapian-core/HACKING, but you can see it online too.
It's useful to look through all of it if you're working on the code, but
the part which is particularly pertinent starts here:

http://trac.xapian.org/browser/trunk/xapian-core/HACKING#L1043

For a patch like this, there's not a lot of user documentation needed -
look to see where we say which stemmers we offer and update those
places.  It's an implementation of an existing algorithm, so a link to
wherever it is officially described would be useful.

For a new stemming algorithm, test coverage is quite important.  We want
to check that it implements the described algorithm, so any examples
from the description should definitely be in the test data.  Also make
sure each rule in the stemmer (assuming it is rule based) has at least
one example which exercises it in the test data.  It's also good to
stem the English word list we already have with the new stemmer and
include that, which helps to ensure it doesn't crash or hang on those
inputs, and that it continues to return the same results for them in
the future (which is useful even if those results haven't all been
checked by hand).
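
As a rough illustration of the kind of check described above, here is a
minimal Python sketch.  It isn't Xapian's actual test harness: toy_stem
is a hypothetical stand-in for the new stemmer, and the word/expected
pairs are made up for the example rather than taken from any real test
data.

```python
from typing import Callable, Iterable, List, Tuple


def toy_stem(word: str) -> str:
    """Hypothetical stand-in stemmer: strips a trailing 'ing' or 's'."""
    for suffix in ("ing", "s"):
        # Only strip when enough of the word is left to be a plausible stem.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def check_stemmer(
    stem: Callable[[str], str],
    cases: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (word, expected, actual) for every case that doesn't match."""
    failures = []
    for word, expected in cases:
        actual = stem(word)
        if actual != expected:
            failures.append((word, expected, actual))
    return failures


# Made-up cases: ideally one or more per rule, plus the examples from the
# algorithm's description.
cases = [("cats", "cat"), ("walking", "walk"), ("is", "is")]
failures = check_stemmer(toy_stem, cases)
for word, expected, actual in failures:
    print(f"{word}: expected {expected!r}, got {actual!r}")
```

The same function works for the regression use: generate the expected
column once from the word list, keep it alongside the words, and any
later change in behaviour shows up as a failure.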

The data files for stemming tests live in xapian-data/stemming/ in
the source tree.

If one or more existing implementations are available, then it's
useful to run the English word list through those too and compare the
results with what you get.
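
A comparison like that can be sketched in a few lines of Python.  This
is only an illustration: stem_ours and stem_reference are hypothetical
toy functions standing in for the new stemmer and an existing
implementation, and the word list is made up.

```python
from typing import Callable, Iterable, List, Tuple


def stem_ours(word: str) -> str:
    """Hypothetical new stemmer: strips a trailing 's' from longer words."""
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def stem_reference(word: str) -> str:
    """Hypothetical existing implementation: also handles 'es'."""
    if word.endswith("es") and len(word) > 4:
        return word[:-2]
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def diff_stemmers(
    words: Iterable[str],
    ours: Callable[[str], str],
    reference: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (word, our_stem, reference_stem) wherever the two disagree."""
    diffs = []
    for word in words:
        a, b = ours(word), reference(word)
        if a != b:
            diffs.append((word, a, b))
    return diffs


words = ["cats", "boxes", "is"]
diffs = diff_stemmers(words, stem_ours, stem_reference)
for word, a, b in diffs:
    print(f"{word}: ours={a!r} reference={b!r}")
```

Each difference is then either a bug in one of the implementations or a
place where the description is ambiguous, and both are worth knowing
about.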

Cheers,
    Olly


