[Xapian-devel] contribution to "Add more stemming algorithms"

Hurricane Tong zhangshangtong.cpp at qq.com
Tue Feb 18 14:08:20 GMT 2014


Hi,


I am trying to contribute to the bite-sized project "Add more stemming algorithms".
I have implemented the Lancaster (Paice/Husk) stemming algorithm as a class named StemLancaster that extends
StemImplementation, following the guide at http://www.comp.lancs.ac.uk/computing/research/stemming/index.htm.
I think this class could be added to the default API for users who are interested in this algorithm.
The source code is at https://github.com/HurricaneTong/Xapian. Could you give me some suggestions about it, and could this code be added to the Xapian source after the necessary modifications?
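
For readers unfamiliar with Paice/Husk, here is a minimal standalone sketch of the rule loop. It is not the StemLancaster code from the repository above: the names Rule, demo_rules and lancaster_sketch_stem are hypothetical, and the five rules are an illustrative subset, not the official Lancaster rule table. Each rule strips an ending, optionally appends replacement text, and then either stops or rescans, and a rule is applied only if the resulting stem passes the acceptability check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One Paice/Husk-style rule (hypothetical struct, for illustration only).
struct Rule {
    std::string suffix;  // ending to match
    bool intact_only;    // apply only if the word has not been changed yet
    std::string append;  // text appended after stripping (may be empty)
    bool terminate;      // true: stop after this rule; false: rescan
};

static bool is_vowel(char c) {
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}

// Acceptability check: a vowel-initial stem needs at least 2 letters; a
// consonant-initial stem needs at least 3 letters including a vowel or 'y'.
static bool acceptable(const std::string& s) {
    if (s.empty()) return false;
    if (is_vowel(s[0])) return s.size() >= 2;
    if (s.size() < 3) return false;
    for (char c : s) {
        if (is_vowel(c) || c == 'y') return true;
    }
    return false;
}

// Illustrative subset of rules; NOT the official Lancaster table.
std::vector<Rule> demo_rules() {
    return {
        {"ss",  false, "ss", true},   // protect "-ss" and stop
        {"ies", false, "y",  false},  // "-ies" -> "-y", then rescan
        {"s",   true,  "",   false},  // drop "-s" only from an intact word
        {"ing", false, "",   false},  // drop "-ing", then rescan
        {"ed",  false, "",   false},  // drop "-ed", then rescan
    };
}

// Apply rules repeatedly until a terminating rule fires or nothing matches.
std::string lancaster_sketch_stem(std::string word,
                                  const std::vector<Rule>& rules) {
    bool intact = true;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Rule& r : rules) {
            assert(!r.suffix.empty());
            if (r.intact_only && !intact) continue;
            if (word.size() < r.suffix.size()) continue;
            const std::size_t cut = word.size() - r.suffix.size();
            if (word.compare(cut, r.suffix.size(), r.suffix) != 0) continue;
            const std::string candidate = word.substr(0, cut) + r.append;
            if (!acceptable(candidate)) continue;  // try the next rule
            word = candidate;
            intact = false;
            if (r.terminate) return word;
            changed = true;
            break;  // rescan the rule list with the shortened word
        }
    }
    return word;
}
```

To use logic like this inside Xapian, it would presumably be wrapped in a subclass of Xapian::StemImplementation that overrides operator()(const std::string &) and get_description(); that wrapper is left out here so the sketch compiles without the Xapian headers.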


Besides, I indexed about 5000 documents from Wikipedia with both Brass and Chert,
and executed about 40000 single-term searches.
With the Brass database it took 5.66s, and with the Chert database it took 5.57s (in a VirtualBox virtual machine). It seems that Brass is slightly slower under these conditions.
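
Since the two timings differ by only about 0.09s and were taken in a virtual machine, repeating each run may help to show whether the gap is stable. Below is a minimal sketch of such a harness; median_seconds is a hypothetical helper, and the workload callable stands in for the 40000-search loop against one backend, which is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Run `workload` `repeats` times and return the median wall-clock time
// in seconds. The median is less sensitive to one-off stalls than a
// single measurement.
double median_seconds(const std::function<void()>& workload, int repeats) {
    assert(repeats > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(repeats));
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        workload();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

The same helper would be called once with a workload that searches the Brass database and once with one that searches the Chert database, and the two medians compared.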


------------------
HurricaneTong, Second Year Undergraduate,
School of Computer Science,
Fudan University, China.