[Xapian-discuss] Clustering / Categorisation

Olly Betts olly at survex.com
Sun Nov 9 16:04:40 GMT 2008


On Tue, Nov 04, 2008 at 09:24:40AM +0200, Denis Kuzmenok wrote:
> I've tried to figure out how to work with clustering and
> categorisation. As I understand it, I have to check out the svn
> branches for clustering and matchspy. But can't understand how to index
> documents properly, how to teach categorisation, how to get a category
> or cluster, how to set weights for clustering, or how to index all this
> from Perl.

Please bear in mind that these are development branches.  They've not
been merged to trunk yet, and lack of complete documentation is often at
least part of the reason why.

"[I] can't understand how to index documents properly" is a very broad
(non-)question - I think you'll need to ask something more specific
about what aspect(s) of indexing you're failing to understand if you
want a useful answer.

The categorisation that matchspy offers isn't a "learning" classifier -
it just picks from pre-defined categories.
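
To make that distinction concrete, here is a toy sketch in plain Python.
It is not the matchspy API, and every name in it is made up for
illustration. It assumes each document already carries a category value
stored at index time, so "categorising" a result set is just tallying
those stored values over the matching documents, with no training step:

```python
from collections import Counter

# Hypothetical documents: each one already carries a category value
# assigned when it was indexed. Nothing is learned at search time.
DOCS = [
    {"id": 1, "text": "red apple pie", "category": "food"},
    {"id": 2, "text": "apple laptop review", "category": "tech"},
    {"id": 3, "text": "green apple salad", "category": "food"},
    {"id": 4, "text": "linux kernel notes", "category": "tech"},
]


def match(docs, term):
    """Return the documents whose text contains the term as a whole word."""
    return [d for d in docs if term in d["text"].split()]


def count_categories(matches):
    """Tally the pre-assigned category values over the matched documents."""
    return Counter(d["category"] for d in matches)


counts = count_categories(match(DOCS, "apple"))
for category, n in counts.most_common():
    print(category, n)
```

A learning classifier would instead need labelled training documents and
would predict a category for unseen text; nothing like that happens here.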

I don't know much about clustering - that's Richard's branch, and I've
barely looked at it yet.

The Perl XS wrappers are currently hand-written, so classes added on a
branch aren't wrapped automatically.  If anyone has written extra
wrappers for any of the branches yet, they've not contributed them to us.

Cheers,
    Olly


