[Xapian-discuss] xapian supports Chinese language

Fabrice Colin fabrice.colin at gmail.com
Wed Apr 8 13:04:33 BST 2009


On Wed, Apr 8, 2009 17:08:31 +0800,  Li Yong <sdliyong at gmail.com> wrote:
> I want to use Xapian to index Chinese HTML pages.
>
> I found the cjk-tokenizer library on the mailing list:
> http://lists.tartarus.org/pipermail/xapian-discuss/2007-June/003921.html
>
> However, I do not know how to add this library to a Xapian project.
>
> Are there any examples or steps to follow?
>
Pinot uses a slightly modified version of Yung-Chung Lin's
cjk-tokenizer that can be found at
http://svn.berlios.de/wsvn/dijon/trunk/cjkv/CJKVTokenizer.cc

For an example, see the XapianIndex and TokensIndexer classes at
http://svn.berlios.de/wsvn/pinot/trunk/IndexSearch/Xapian/XapianIndex.cpp
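To give a rough idea of what such a tokenizer feeds to Xapian, here is a minimal sketch of overlapping-bigram tokenization, a common way to index CJK text that has no spaces between words. It is not the code from CJKVTokenizer or Pinot: the helper names utf8_chars and cjk_bigrams are hypothetical, the input is assumed to be well-formed UTF-8, and no attempt is made to tell CJK characters apart from other scripts.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a UTF-8 string into one substring per code point.
// Assumes well-formed UTF-8; a truncated trailing sequence is kept as-is.
std::vector<std::string> utf8_chars(const std::string& text) {
    std::vector<std::string> chars;
    std::size_t i = 0;
    while (i < text.size()) {
        unsigned char lead = static_cast<unsigned char>(text[i]);
        std::size_t len = 1;             // ASCII or stray continuation byte
        if (lead >= 0xF0) len = 4;       // 4-byte sequence
        else if (lead >= 0xE0) len = 3;  // 3-byte sequence (most CJK)
        else if (lead >= 0xC0) len = 2;  // 2-byte sequence
        if (i + len > text.size()) len = text.size() - i;
        chars.push_back(text.substr(i, len));
        i += len;
    }
    return chars;
}

// Hypothetical helper: produce overlapping bigrams, so that the characters
// A B C D yield the terms AB, BC, CD. A single character is returned alone.
std::vector<std::string> cjk_bigrams(const std::string& text) {
    std::vector<std::string> chars = utf8_chars(text);
    std::vector<std::string> terms;
    if (chars.size() == 1) {
        terms.push_back(chars[0]);
        return terms;
    }
    for (std::size_t i = 0; i + 1 < chars.size(); ++i) {
        terms.push_back(chars[i] + chars[i + 1]);
    }
    return terms;
}
```

Each resulting term could then be added to a Xapian::Document with add_posting(term, position) before the document is passed to WritableDatabase::add_document(). Query strings would need to go through the same tokenization so that the query terms match the indexed ones.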

I hope this helps.

Fabrice


