[Xapian-devel] Integrating a PaiceHusk stemmer into the library

Olly Betts olly at survex.com
Thu Jan 24 22:39:36 GMT 2013


On Thu, Jan 24, 2013 at 10:26:31AM +0530, aarsh shah wrote:
> I've implemented a PaiceHusk stemmer externally.  So what I
> am doing right now is passing a pointer to my StemPaiceHusk class (which in
> turn has been subclassed from StemImplementation) to the
> Stem::Stem(StemImplementation *p) constructor.  So basically, I have to
> include "paicehusk.h" in my indexer.  However, I now want to make it a part
> of the Xapian library so that I can simply include <xapian.h> in my indexer
> and use something like Xapian::Stem("paicehusk") to use the stemmer (as we
> do for the inbuilt snowball stemmers).

OK.

> -> I tried doing this in a lot of ways, including adding
> 
> #include "paicehusk.h" in stem.cc (paicehusk.h and stem.cc are in the same
> directory) and then the following code in the Stem::Stem(const string
> &language) constructor:
> 
> if (language == "paicehusk")
>                internal(new StemPaiceHusk) and also internal = new
> StemPaiceHusk;

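The second version (the plain assignment) should do the trick.  The
internal(new StemPaiceHusk) form is only valid in a constructor's
initialiser list, not inside the constructor body.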

> -> But it's not working.  I get an error when I make the library.  Please
> can you help me out here, as this will also help me in integrating further
> changes in the library that I plan to do now.

You really need to tell us the error message for us to be able to help.
From IRC it looks like you've now solved this, though (the problem was
that you hadn't added paicehusk.cc to the library).

I had a pending change to how we dispatch language names which I hadn't
committed - I've now committed it, so the code there is simpler, but
you'll need to tweak your change slightly.  I think you can just add
this as the first line of paicehusk.cc:

// Alias: paicehusk

And then append paicehusk.cc to the command in this rule in Makefile.mk:

languages/sbl-dispatch.h: languages/collate-sbl languages/Makefile.mk
        $(PERL) '$(srcdir)/languages/collate-sbl' '$(srcdir)' $(snowball_algorithms)

Then in stem.cc, you'll just have:

    case PAICEHUSK:
        internal = new StemPaiceHusk;
        return;
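
To illustrate the overall shape, here is a minimal self-contained sketch of
name-based dispatch to a StemImplementation subclass.  It does not use the
real Xapian headers: the StemImplementation and Stem classes are simplified
stand-ins, the enum values other than PAICEHUSK and the lookup_language()
helper are hypothetical (in the library the language codes come from the
generated sbl-dispatch.h), and the stemming rule is a placeholder, not the
Paice/Husk algorithm.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Simplified stand-in for Xapian::StemImplementation.
class StemImplementation {
  public:
    virtual ~StemImplementation() = default;
    virtual std::string operator()(const std::string& word) = 0;
    virtual std::string get_description() const = 0;
};

// Placeholder stemmer: just strips one trailing 's'.
// This is NOT the Paice/Husk algorithm, only something to dispatch to.
class StemPaiceHusk : public StemImplementation {
  public:
    std::string operator()(const std::string& word) override {
        if (word.size() > 1 && word.back() == 's')
            return word.substr(0, word.size() - 1);
        return word;
    }
    std::string get_description() const override { return "paicehusk"; }
};

// Hypothetical language codes (assumption for this sketch).
enum sbl_code { NONE, PAICEHUSK, UNKNOWN };

// Hypothetical helper mapping a language name to a code.
sbl_code lookup_language(const std::string& language) {
    if (language.empty() || language == "none") return NONE;
    if (language == "paicehusk") return PAICEHUSK;
    return UNKNOWN;
}

// Simplified stand-in for Xapian::Stem, owning its implementation.
class Stem {
    std::unique_ptr<StemImplementation> internal;

  public:
    // External use: the caller supplies the implementation.
    explicit Stem(StemImplementation* p) : internal(p) {}

    // Built-in use: the implementation is chosen from the language name.
    explicit Stem(const std::string& language) {
        switch (lookup_language(language)) {
            case PAICEHUSK:
                internal.reset(new StemPaiceHusk);
                return;
            case NONE:
                return;  // no stemming: words pass through unchanged
            case UNKNOWN:
                break;
        }
        throw std::invalid_argument("Language code " + language + " unknown");
    }

    // Stem a word, or return it unchanged if no implementation is set.
    std::string operator()(const std::string& word) const {
        return internal ? (*internal)(word) : word;
    }
};
```

With this in place, Stem(std::string("paicehusk")) and
Stem(new StemPaiceHusk) behave the same way from the caller's side, which is
the effect the change to stem.cc is after.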

Cheers,
    Olly


