[Xapian-discuss] bigrams and co-occurrence matrix
☼ 林永忠 ☼ (Yung-chung Lin)
henearkrxern at gmail.com
Wed Oct 28 01:45:11 GMT 2009
Hi Ying,
You can download libunicode from here:
http://ftp.gnome.org/pub/gnome/sources/libunicode/0.4/
Best,
Yung-chung Lin
2009/10/27 Ying Liu <liux0395 at umn.edu>
> Hi Yung-chung,
>
> Thanks for your reply. I downloaded the cjk-tokenizer from CPAN at
> http://search.cpan.org/~xern/Lingua-CJK-Tokenizer-0.01/lib/Lingua/CJK/Tokenizer.pm.
> It lists libunicode by Tom Tromey as a prerequisite, but I can't find this
> module on CPAN. What should I install to make the cjk-tokenizer module work?
>
> Thanks,
> Ying
>
>
> ☼ 林永忠 ☼ (Yung-chung Lin) wrote:
>
>> Hi Ying,
>>
>> You may want to look at http://code.google.com/p/cjk-tokenizer/
>> A Perl binding is also included.
>>
>> Best,
>> Yung-chung Lin
>>
>>
>> 2009/10/26 Ying Liu <liux0395 at umn.edu>
>>
>>
>> Hello all,
>>
>> I want to work out a way to count bigrams and build a
>> co-occurrence matrix with the Xapian Perl modules. Checking the
>> archived emails, I found some discussion of CJK tokens, but I am only
>> working on English documents. My immediate questions are how Xapian
>> handles bigrams and how it can do that with windowing, as NSP does with
>> its --window option. Has anyone worked on this before? Do you have
>> any suggestions?
>>
>> Thank you,
>> Ying
>>
>>
>> _______________________________________________
>> Xapian-discuss mailing list
>> Xapian-discuss at lists.xapian.org
>>
>> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>>
>>
>>
>
>
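For the windowed bigram counting asked about in the original message, here is a minimal sketch in plain Python. It does not use Xapian or NSP at all; it only illustrates the idea as I understand it: with a window of N, an ordered pair is counted for any two tokens that fall within N consecutive tokens, so a window of 2 gives ordinary adjacent bigrams. The function names (tokenize, windowed_bigrams, cooccurrence_matrix) and the simple regex tokenizer are assumptions made for illustration, not part of any library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens.

    Hypothetical tokenizer for English text; a real pipeline would
    likely use something more careful.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def windowed_bigrams(tokens, window=2):
    """Count ordered pairs of tokens that lie within `window` tokens.

    window=2 counts only adjacent pairs; larger windows also pair each
    token with tokens further to its right.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    counts = Counter()
    for i, left in enumerate(tokens):
        # Pair the token at position i with the next (window - 1) tokens.
        for right in tokens[i + 1 : i + window]:
            counts[(left, right)] += 1
    return counts


def cooccurrence_matrix(counts):
    """Turn pair counts into a dense matrix indexed by sorted vocabulary."""
    vocab = sorted({word for pair in counts for word in pair})
    index = {word: k for k, word in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (left, right), n in counts.items():
        matrix[index[left]][index[right]] = n
    return vocab, matrix


if __name__ == "__main__":
    tokens = tokenize("The cat sat on the mat")
    counts = windowed_bigrams(tokens, window=3)
    vocab, matrix = cooccurrence_matrix(counts)
    print(vocab)
    for word, row in zip(vocab, matrix):
        print(word, row)
```

The matrix here is directional (row word appears before column word). If a symmetric co-occurrence matrix is wanted, the counts for (a, b) and (b, a) could be added together. For a large vocabulary a sparse structure such as the Counter itself is probably a better fit than a dense list of lists.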