[Xapian-discuss] bigrams and co-occurrence matrix

☼ 林永忠 ☼ (Yung-chung Lin) henearkrxern at gmail.com
Wed Oct 28 01:45:11 GMT 2009


Hi Ying,

You may download libunicode from here:
http://ftp.gnome.org/pub/gnome/sources/libunicode/0.4/

Best,
Yung-chung Lin

2009/10/27 Ying Liu <liux0395 at umn.edu>

> Hi Yung-chung,
>
> Thanks for your reply. I downloaded the cjk-tokenizer from CPAN at
> http://search.cpan.org/~xern/Lingua-CJK-Tokenizer-0.01/lib/Lingua/CJK/Tokenizer.pm.
> It has a prerequisite, libunicode by Tom Tromey, which I can't find on
> CPAN. What should I install to make the cjk-tokenizer module work?
>
> Thanks,
> Ying
>
>
> ☼ 林永忠 ☼ (Yung-chung Lin) wrote:
>
>> Hi Ying,
>>
>> You may want to check this: http://code.google.com/p/cjk-tokenizer/
>> A Perl binding is also included.
>>
>> Best,
>> Yung-chung Lin
>>
>>
>> 2009/10/26 Ying Liu <liux0395 at umn.edu>
>>
>>
>>    Hello all,
>>
>>    I want to work out a solution for counting bigrams and creating a
>>    co-occurrence matrix with the Xapian Perl modules. Checking the
>>    archived emails, I found some discussions about CJK tokens, but I am
>>    only working on English documents. My immediate questions are how
>>    Xapian handles bigrams and how it can do that with windowing, as NSP
>>    does with its --window option. Has anyone worked on this before? Do
>>    you have any suggestions?
>>
>>    Thank you,
>>    Ying
>>
>>
>>    _______________________________________________
>>    Xapian-discuss mailing list
>>    Xapian-discuss at lists.xapian.org
>>
>>    http://lists.xapian.org/mailman/listinfo/xapian-discuss
>>
>>
>>
>

