[Xapian-discuss] Getting documents "like" arbitrary text?

Olly Betts olly at survex.com
Mon Jun 30 22:30:09 BST 2008


On Sun, Jun 29, 2008 at 04:56:16PM +0200, Ryan Shaw wrote:
> Because RSet.add_document takes a docid, it seems I must add my
> document to a database before I can include it in a relevance set. I
> don't really want to add the arbitrary input text to my index, though.
> Should I be going about this a different way?

Look at OP_ELITE_SET, which was added for this sort of thing.  Give
it all the terms from the document and it will pick the best N and make
them into an "OR" query.  It can of course be combined with other query
operators.

You should probably think of "best" as defined by the results it gives
rather than by any fixed rule, but currently it picks the terms with the
highest maximum term weight (as reported by the current weighting scheme).
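
To make the selection idea concrete, here is a minimal plain-Python sketch.
It is not Xapian's implementation and does not call the Xapian API: the
function names (elite_set, as_or_query) and the weights are made up for
illustration. It only assumes that each term has some maximum weight, keeps
the N highest, and joins them with "OR".

```python
import heapq


def elite_set(term_weights, n):
    """Return the n terms with the highest weight, best first.

    term_weights maps term -> weight. In Xapian the weight would come
    from the weighting scheme; here it is just a number we supply.
    """
    # Order by weight descending, then by term so ties are deterministic.
    best = heapq.nsmallest(
        n, term_weights.items(), key=lambda kv: (-kv[1], kv[0])
    )
    return [term for term, _weight in best]


def as_or_query(terms):
    """Render the chosen terms as a human-readable OR query string."""
    return " OR ".join(terms)


# Hypothetical weights for the terms of some input text.
weights = {"xapian": 7.2, "the": 0.1, "relevance": 5.4, "and": 0.2, "index": 3.9}
elite = elite_set(weights, 3)
print(as_or_query(elite))
```

Common words such as "the" and "and" get low weights and drop out, which is
why passing every term from the input text is reasonable.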

Cheers,
    Olly


