[Xapian-discuss] bigrams and co-occurrence matrix
☼ 林永忠 ☼ (Yung-chung Lin)
henearkrxern at gmail.com
Tue Oct 27 03:27:10 GMT 2009
Hi Ying,
You may want to take a look at this: http://code.google.com/p/cjk-tokenizer/
A Perl binding is also included.
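
For English text, the windowed counting itself can be sketched without Xapian. The block below is a minimal Python sketch, not Xapian or NSP code. The function names (tokenize, windowed_bigrams, cooccurrence_matrix) and the simple lowercase regex tokenizer are assumptions made for illustration. The window semantics are my understanding of NSP's --window option: a window of N pairs each token with the next N-1 tokens, so window=2 gives plain adjacent bigrams.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def windowed_bigrams(tokens, window=2):
    # Count ordered pairs (left, right) where right follows left
    # within `window` tokens. window=2 means adjacent tokens only.
    if window < 2:
        raise ValueError("window must be at least 2")
    counts = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1:i + window]:
            counts[(left, right)] += 1
    return counts


def cooccurrence_matrix(counts):
    # Build a dense, ordered co-occurrence matrix from the pair counts.
    # Row = left word, column = right word.
    vocab = sorted({word for pair in counts for word in pair})
    index = {word: k for k, word in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (left, right), n in counts.items():
        matrix[index[left]][index[right]] = n
    return vocab, matrix


if __name__ == "__main__":
    tokens = tokenize("The cat sat on the mat")
    counts = windowed_bigrams(tokens, window=3)
    vocab, matrix = cooccurrence_matrix(counts)
    print(vocab)
    for word, row in zip(vocab, matrix):
        print(word, row)
```

A dense list of lists is only practical for a small vocabulary. For a real collection, keeping the Counter (or another sparse structure) is likely the better choice.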
Best,
Yung-chung Lin
2009/10/26 Ying Liu <liux0395 at umn.edu>
> Hello all,
>
> I want to work out a solution for counting bigrams and creating a
> co-occurrence matrix with the Xapian Perl modules. Checking the archived
> emails, I found some discussions about CJK tokens, but I am working only on
> English documents. My immediate questions are how Xapian handles bigrams and
> how it can do that with windowing, as NSP does with the --window option. Has
> anyone worked on this before? Do you have any suggestions?
>
> Thank you,
> Ying