[Xapian-discuss] Document and folder suggestions

Serkan Cabi cabi at MIT.EDU
Mon Jan 28 06:13:43 GMT 2008


Hello,
I didn't quite understand the second answer:
> Assuming you add a boolean term for the folder to each document in the
> database, run the given document as a query, and mark the top few
> results as relevant, then expand selecting only the folder terms.
What do you mean by "a boolean term"?
I currently store the path with Document::set_data() and index the file
content with TermGenerator::index_text().

Thanks.
--
Serkan Cabi
MIT Center for Theoretical Physics


On Jan 27, 2008, at 6:01 PM, Olly Betts wrote:

> On Sun, Jan 27, 2008 at 04:01:54PM -0500, Serkan Cabi wrote:
>> Currently, to find related documents, I take a document, create a
>> one-item rset, get an eset of at most 100 terms from it, and search for
>> those terms to get a list of documents. Here is the code:
>
> I suspect 100 is too many.  Omega uses 40 for this (raised from 6 after
> someone reported that gave better results), but it's certainly worth
> experimenting.
>
>> 1) Is there a better way to get similar documents for a given  
>> document?
>
> You could take all the terms from the given document and combine them
> with OP_ELITE_SET to select the best discriminators and run a query with
> those.  I'm not sure which would give better results, but they're likely
> to involve a similar amount of work.
>
>> 2) Is there a way to suggest a folder for a given document to be
>> classified in?
>
> Assuming you add a boolean term for the folder to each document in the
> database, run the given document as a query, and mark the top few
> results as relevant, then expand selecting only the folder terms.
>
> Cheers,
>    Olly
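
As a rough illustration of the folder suggestion in the second answer
above, here is a library-free C++ sketch. It does not call Xapian at all:
each search result is modelled as a plain list of terms, the "XFOLDER:"
prefix and the name suggest_folder() are hypothetical choices made for
this example, and a simple count stands in for the relevance weighting
that Xapian's expand step would actually apply. In Xapian itself, a folder
term of this kind would be attached to each document as an extra term
(for instance with Document::add_term()) rather than stored in the
document data.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical prefix marking folder terms; Xapian does not mandate one.
const std::string kFolderPrefix = "XFOLDER:";

// Toy stand-in for an indexed document: just its list of terms.
using TermList = std::vector<std::string>;

// Given search results ordered best-first, treat the top `top_n` as the
// relevant set and return the folder (prefix stripped) whose term occurs
// most often among them. Returns "" if no folder term is found.
// Ties go to the folder that sorts first alphabetically.
std::string suggest_folder(const std::vector<TermList>& ranked_results,
                           std::size_t top_n) {
    std::map<std::string, int> votes;
    const std::size_t limit = std::min(top_n, ranked_results.size());
    for (std::size_t i = 0; i < limit; ++i) {
        for (const std::string& term : ranked_results[i]) {
            // Keep only folder terms, mirroring "expand selecting only
            // the folder terms"; every other term is ignored.
            if (term.compare(0, kFolderPrefix.size(), kFolderPrefix) == 0) {
                ++votes[term.substr(kFolderPrefix.size())];
            }
        }
    }
    std::string best;
    int best_votes = 0;
    for (const auto& entry : votes) {
        if (entry.second > best_votes) {
            best = entry.first;
            best_votes = entry.second;
        }
    }
    return best;
}
```

Here ranked_results would hold the term lists of the documents returned
when the new document is run as a query, and top_n plays the role of "the
top few results" marked as relevant; how many to use is something to
experiment with.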



