[Xapian-discuss] Filter similar results

Robby Walker robby.walker at gmail.com
Wed Sep 13 17:20:52 BST 2006


> > Specifically, what I have are a number of sets of documents, and I'd
> > like to get the best documents from *each* set.  So, making up some
> > relevance scores, say I have documents from set A with scores 10,9,8,7
> > and from set B with scores 6,5.  If I am only to return 3 documents to
> > the user I'd like to return 10,9, and 6.  i.e. I'd like to filter out
> > the similar documents 8 and 7 from A.
>
> How are you choosing the numbers to return from each set? I can think
> of a variety of approaches which would result in that match set, but I
> can't be sure which one you mean :-)

Ideally I'd like the number of results shown from each set to be
proportional to the number of matches in that set.  So, if I am showing
10 results, and set A has 100 possible matches, B has 50, and C has 25,
I'd like to show the top 6 from A, the top 3 from B, and the top 1 from C.
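
As a rough sketch of the proportional split I have in mind (this is
plain Python, not Xapian code; the function names and the
largest-remainder rounding are assumptions made for illustration, and
the per-set match counts are assumed to be known already):

```python
def allocate_slots(set_sizes, total_slots):
    """Split total_slots across sets in proportion to their sizes."""
    total = sum(set_sizes.values())
    # Never promise more results than actually exist.
    total_slots = min(total_slots, total)
    if total_slots <= 0:
        return {name: 0 for name in set_sizes}

    # Integer share for each set, plus the remainder left by the division.
    slots = {}
    remainders = {}
    for name, size in set_sizes.items():
        slots[name], remainders[name] = divmod(total_slots * size, total)

    # Hand the unassigned slots to the sets with the largest remainders.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for name in by_remainder[:leftover]:
        slots[name] += 1
    return slots


def pick_results(scores_by_set, total_slots):
    """Take the top-scoring documents from each set, then merge by score."""
    sizes = {name: len(scores) for name, scores in scores_by_set.items()}
    slots = allocate_slots(sizes, total_slots)
    picked = []
    for name, scores in scores_by_set.items():
        top = sorted(scores, reverse=True)[:slots[name]]
        picked.extend((score, name) for score in top)
    return sorted(picked, reverse=True)


print(allocate_slots({"A": 100, "B": 50, "C": 25}, 10))
# {'A': 6, 'B': 3, 'C': 1}
print(pick_results({"A": [10, 9, 8, 7], "B": [6, 5]}, 3))
# [(10, 'A'), (9, 'A'), (6, 'B')]
```

The exact shares for 100/50/25 are about 5.7, 2.9 and 1.4, so some
rounding rule is needed; giving the spare slots to the largest
remainders is just one reasonable choice.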

Of course, I realize that this is probably *really* complicated so I'm
willing to settle for any reasonable approximation.

Thanks,
Robby


