[Xapian-discuss] New scws patch for xapian-core based on svn trunk

hightman hightman at twomice.net
Tue Apr 24 05:39:43 BST 2012


Hello, thanks.

I have created a new patch file based on the xapian-core-1.3.x snapshot package. It works both with and without scws.

It also includes some optimizations for CJK terms, such as removing the stemmed record and converting multi-segmentation results into synonym queries.

The patch file can be downloaded from:
http://www.xunsearch.com/download/xapian-scws-1.3.x-snap.patch

1. Compile and install the scws library:
    http://www.ftphp.com/scws/down/scws-1.2.0.tar.bz2
2. Extract the dictionary files into the 'etc/' directory of the scws install directory:
    http://www.ftphp.com/scws/down/scws-dict-chs-utf8.tar.bz2
3. Patch and re-configure xapian-core:
    patch -p1 < xapian-scws-1.3.x-snap.patch
    autoreconf
    rm -f queryparser/queryparser_internal.cc
    ./configure --with-scws=/usr/local/scws
    make
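
Steps 1 and 2 above only give the download URLs, so here is a rough sketch of how they might be scripted. It is not taken from the scws documentation. It assumes that the tarball unpacks into scws-1.2.0/, that scws uses the usual ./configure, make, make install sequence, and that the dictionary archive can be unpacked directly into etc/. The PREFIX and DRY_RUN variables and the run, build_scws and install_dict helpers are names made up for this sketch. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of steps 1-2. Assumptions (not confirmed by the original mail):
# the archive layout and a standard autoconf-style build for scws.
set -e

PREFIX="${PREFIX:-/usr/local/scws}"   # same path as --with-scws in step 3
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: download, build and install the scws library.
build_scws() {
    run curl -LO http://www.ftphp.com/scws/down/scws-1.2.0.tar.bz2
    run tar xjf scws-1.2.0.tar.bz2
    run cd scws-1.2.0                 # assumed top-level directory name
    run ./configure --prefix="$PREFIX"
    run make
    run make install
    run cd ..
}

# Step 2: unpack the dictionary files into etc/ under the install prefix.
install_dict() {
    run mkdir -p "$PREFIX/etc"
    run curl -LO http://www.ftphp.com/scws/down/scws-dict-chs-utf8.tar.bz2
    run tar xjf scws-dict-chs-utf8.tar.bz2 -C "$PREFIX/etc"
}

build_scws
install_dict
```

Run it once as is to check the printed commands, then again with DRY_RUN=0 to execute them. Installing under /usr/local will probably need root privileges.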

----- Simple test result -----

localhost:examples hightman$ ./simpleindex ./db
大家好,我是海鳗,来自中华人民共和国。
Hello, I am hightman and come from china.

localhost:examples hightman$ ./simpleindex ./db
我喜欢穿T恤,喜欢计算机和服务器

localhost:examples hightman$ ./simplesearch ./db 喜欢
Parsed query is: Query(喜欢@1)
1 results found.
Matches 1-1:

1: 0.569074 docid=2 [我喜欢穿T恤,喜欢计算机和服务器]

localhost:examples hightman$ ./simplesearch ./db chinas
Parsed query is: Query(Zchina@1)
1 results found.
Matches 1-1:

1: 0.377177 docid=1 [大家好,我是海鳗,来自中华人民共和国。 Hello, I am hightman and come from china.]

localhost:examples hightman$ ./simplesearch ./db 中华人民
Parsed query is: Query((中华人民@1 SYNONYM (中华@89 OR 人民@90)))
1 results found.
Matches 1-1:

1: 0.121029 docid=1 [大家好,我是海鳗,来自中华人民共和国。 Hello, I am hightman and come from china.]

localhost:examples hightman$ ./simplesearch ./db 穿T恤
Parsed query is: Query((穿@1 OR t恤@2))
1 results found.
Matches 1-1:

1: 0.876681 docid=2 [我喜欢穿T恤,喜欢计算机和服务器]

On 2012-3-30, at 12:00 PM, Olly Betts wrote:

> Sorry, I missed this mail before.
> 


