[Xapian-discuss] Document and folder suggestions
Olly Betts
olly at survex.com
Sun Jan 27 23:01:09 GMT 2008
On Sun, Jan 27, 2008 at 04:01:54PM -0500, Serkan Cabi wrote:
> Currently to find related documents I get a document, create a one
> item rset, get eset of max size 100 of it and search those terms to
> get a list of documents. Here is the code:
I suspect 100 is too many. Omega uses 40 for this (raised from 6 after
someone reported that gave better results), but it's certainly worth
experimenting.
> 1) Is there a better way to get similar documents for a given document?
You could take all the terms from the given document and combine them
with OP_ELITE_SET to select the best discriminators and run a query with
those. I'm not sure which would give better results, but they're likely
to involve a similar amount of work.
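
To make the elite-set idea concrete, here is a toy sketch in plain Python. It does not use Xapian: the corpus layout, the function names and the IDF-style weight are all assumptions made for illustration, and Xapian's own term weighting for OP_ELITE_SET is different. It only shows the shape of the approach: weight every term in the source document, keep the best few discriminators, and rank the other documents by the kept terms they share.

```python
import math
from collections import Counter


def term_weight(term, doc_freq, num_docs):
    # Toy IDF-style weight (an assumption, not Xapian's formula):
    # terms that occur in fewer documents discriminate better.
    return math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))


def elite_terms(doc_terms, doc_freq, num_docs, elite_size=10):
    # Keep the elite_size highest-weighted distinct terms of the document.
    # Ties are broken alphabetically so the result is deterministic.
    distinct = set(doc_terms)
    ranked = sorted(
        distinct,
        key=lambda t: (-term_weight(t, doc_freq, num_docs), t),
    )
    return ranked[:elite_size]


def rank_similar(source_id, corpus, elite_size=10):
    # corpus is a hypothetical {doc_id: [terms]} mapping.
    num_docs = len(corpus)
    doc_freq = Counter(t for terms in corpus.values() for t in set(terms))
    elite = set(elite_terms(corpus[source_id], doc_freq, num_docs, elite_size))

    scores = {}
    for doc_id, terms in corpus.items():
        if doc_id == source_id:
            continue
        # Score is the summed weight of the elite terms the document contains.
        shared = elite & set(terms)
        score = sum(term_weight(t, doc_freq, num_docs) for t in shared)
        if score > 0:
            scores[doc_id] = score
    # Best match first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

The elite_size parameter plays the same role as the ESet size discussed above, so it is the obvious value to experiment with.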
> 2) Is there a way to suggest a folder for a given document to be
> classified in?
Assuming you add a boolean term for the folder to each document in the
database: run the given document as a query, mark the top few results
as relevant, then expand, keeping only the folder terms.
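
As a rough illustration of that recipe, the following plain-Python sketch takes an already ranked list of similar documents, treats the top few as relevant, and keeps only the folder terms. It is not Xapian code: the "XFOLDER" prefix, the function name and the data layout are hypothetical, and simple vote counting stands in for the weights a real expand would assign.

```python
from collections import Counter

# Hypothetical prefix marking the boolean folder term on each document.
FOLDER_PREFIX = "XFOLDER"


def suggest_folders(ranked_doc_ids, doc_terms, top_k=5, max_folders=3):
    # Treat the top_k results of the similarity query as the relevant set.
    relevant = ranked_doc_ids[:top_k]

    votes = Counter()
    for doc_id in relevant:
        for term in set(doc_terms[doc_id]):
            # Keep only folder terms, ignoring ordinary text terms.
            if term.startswith(FOLDER_PREFIX):
                votes[term[len(FOLDER_PREFIX):]] += 1

    # Most frequent folder first; ties broken alphabetically.
    ranked = sorted(votes, key=lambda f: (-votes[f], f))
    return ranked[:max_folders]
```

For example, if two of the top three results carry the term XFOLDERwork and one carries XFOLDERhome, the suggestion order is work, then home.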
Cheers,
Olly