[Xapian-devel] Document clustering module?

☼ 林永忠 ☼ (Yung-chung Lin) henearkrxern at gmail.com
Mon Sep 17 15:54:55 BST 2007

Yes, that's true. Keeping the design simple and plain is the right way to go.

Yung-chung Lin

On 9/17/07, Olly Betts <olly at survex.com> wrote:
> If you're talking about grouping collapsed documents, that should
> probably happen during the match process, like collapse does.  Don't
> worry too much about that idea - let's focus on the clustering part
> for now, and just bear in mind how it might be reused for this (or
> perhaps this problem is too different).
>
> If you're not talking about that, there needs to be a clustering
> algorithm specified for this to work.
>
> I wouldn't get too fancy initially - we don't want to produce an
> elaborate API which we think does everything conceivable, only to
> discover a better approach or something it can't nicely do, and then
> have to choose between keeping the sub-optimal API we have, or the pain
> of deprecation and transition.
>
> Let's just go with tagging each MSet entry with a cluster id for now.
> That seems a good starting point, and everything which has been
> suggested so far can either be built on top of that, or provide that as
> a side-effect.
>
> And that should allow us to get clustering functionality into a release
> sooner.
>
> Cheers,
>     Olly
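
To make the "tag each MSet entry with a cluster id" idea concrete, here is
a rough sketch in plain Python. It does not use the Xapian API at all:
Entry, tag_entries, group_by_cluster and the toy assignment rule are all
made-up names for illustration, and the real clustering algorithm is still
to be decided. It only shows that per-cluster groupings can be built on
top of simple tags.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for one MSet entry; not a Xapian class."""
    docid: int
    weight: float
    cluster_id: Optional[int] = None  # filled in by the clustering step


def tag_entries(entries: List[Entry], assign: Callable[[Entry], int]) -> None:
    """Tag each entry in place with the cluster id chosen by `assign`."""
    for entry in entries:
        entry.cluster_id = assign(entry)


def group_by_cluster(entries: List[Entry]) -> Dict[Optional[int], List[int]]:
    """Build per-cluster docid lists from the tags, keeping rank order."""
    groups: Dict[Optional[int], List[int]] = defaultdict(list)
    for entry in entries:
        groups[entry.cluster_id].append(entry.docid)
    return dict(groups)


# Entries in rank order, as a match would return them (values invented).
entries = [Entry(12, 9.1), Entry(7, 8.4), Entry(31, 6.0), Entry(5, 2.2)]

# Toy assignment rule (an assumption): even docids in cluster 0, odd in 1.
tag_entries(entries, lambda entry: entry.docid % 2)

clusters = group_by_cluster(entries)
print(clusters)
```

The tags are the only thing the first step has to produce. Grouping,
per-cluster display, and anything fancier can be layered on afterwards
without changing how the tags are stored.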
