[Xapian-devel] Hi,
Olly Betts
olly at survex.com
Thu Aug 25 12:26:18 BST 2011
On Thu, Aug 25, 2011 at 04:01:30PM +0530, Aman (neshu) Agarwal wrote:
> If I am not wrong to add stemming I need to edit files in
> "xapian/xapian-core/languages" let say I want to add new stemming algorithm
> for english, so in that case I need to make changes in english.cc
If you're adding a new algorithm, create a new file (or files) for it.
There are actually 3 English stemming algorithms there already:
* english.cc generated from english.sbl, which is the Snowball English
stemmer
* porter.cc generated from porter.sbl, which is the Porter stemmer - an
older version of english.sbl, included for compatibility mostly
* lovins.cc generated from lovins.sbl, which is Lovins' algorithm
If you're adding one written in Snowball (http://snowball.tartarus.org/)
then you'd add the .sbl file and update the list in
languages/Makefile.mk and the .cc version would be generated
automatically. If you're adding an algorithm coded by hand in C or C++
then you'd just add that file (and add it to languages/Makefile.mk too).
Then in stem.cc there's a big switch statement which determines which
algorithm to use when constructing a new Stem object, so you need to
hook up your new algorithm there.
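
As a rough illustration of that kind of hook-up, here is a minimal,
self-contained sketch of dispatching on a language name to pick an
implementation. It is not the actual code in stem.cc: the names StemImpl,
IdentityStem, TrailingSStem, make_stemmer and the "trailing_s" language
string are all hypothetical, and the toy algorithm is only there so that
the sketch does something observable.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical interface for a stemming algorithm (an assumption for this
// sketch, not Xapian's internal class).
struct StemImpl {
    virtual ~StemImpl() = default;
    virtual std::string stem(const std::string& word) const = 0;
};

// Placeholder standing in for an algorithm that already exists.
// It returns the word unchanged.
struct IdentityStem : StemImpl {
    std::string stem(const std::string& word) const override {
        return word;
    }
};

// Hypothetical new hand-written algorithm: strips one trailing 's'.
struct TrailingSStem : StemImpl {
    std::string stem(const std::string& word) const override {
        if (word.size() > 1 && word.back() == 's')
            return word.substr(0, word.size() - 1);
        return word;
    }
};

// Select an implementation from the language name. Switching on the first
// character narrows the candidates before the full string comparison.
std::unique_ptr<StemImpl> make_stemmer(const std::string& language) {
    if (language.empty())
        throw std::invalid_argument("empty language name");
    switch (language[0]) {
        case 'e':
            if (language == "english")
                return std::unique_ptr<StemImpl>(new IdentityStem);
            break;
        case 't':
            // The new algorithm is hooked up by adding a case like this one.
            if (language == "trailing_s")
                return std::unique_ptr<StemImpl>(new TrailingSStem);
            break;
    }
    throw std::invalid_argument("unknown language: " + language);
}
```

With this sketch, make_stemmer("trailing_s")->stem("cats") goes through the
new case, while an unrecognised name throws std::invalid_argument. The real
switch in stem.cc may be organised differently, so check it before copying
this shape.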
> P.S. I compiled the developer version successfully
Cool. Is the solution something useful to share, or was it just a local
issue?
Cheers,
Olly