How to get the serialise score returned in Xapian::KeyMaker->operator().

张少华 xiangqianzsh at 163.com
Tue Jan 30 16:30:02 GMT 2018


> What's the relative speed difference you're seeing?
I have written a demo to compare the performance of PostingSource and KeyMaker. You can see the details at this link:
https://github.com/xiangqianzsh/xapian_leaning/tree/master/compare_keymaker_and_postingsource


We generate 30 million documents. When searching, we first use one term (for example, t1) to select some documents, and then sort them in descending order by 0.5 * doc.get_value(1), i.e. 0.5 * score.


In this comparison, the time taken by PostingSource is 6 times that of KeyMaker.
>If I follow, you're saying your query is just this an ExternalWeightPostingSource object?
When we use ExternalWeightPostingSource, we first use Query(const std::string &term, Xapian::termcount wqf=1, Xapian::termpos pos=0) to select some documents, and Query(Xapian::PostingSource *source) to sort them. We join the two with Xapian::Query(Xapian::Query::OP_AND_MAYBE, query, query_extwps).
We get the weight of each document from the get_weight() function, but we cannot know the maximum weight without checking all the documents, so we set max_weight to a large number. Even if we could estimate the upper bound of get_weight() in our case, it might still not work well unless some documents' weights are exactly equal to the upper bound.
For example, suppose the upper bound is 1 and you want the top 10 documents, but only 9 documents have weight 1; then you have to check all the documents to choose the top 10. That's why I wrote 'a single PostingSource' in my last email.
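The top-10 example can be checked with a small simulation. This is only a simplified model of early termination, not Xapian's actual matcher: docs_examined() is a hypothetical helper that scans weights in docid order, keeps the k best, and stops only once the k-th best weight has reached the declared upper bound.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Simplified model (an assumption for illustration, not Xapian's matcher):
// returns how many documents are examined before the scan can stop.
std::size_t docs_examined(const std::vector<double>& weights,
                          std::size_t k, double max_weight) {
    assert(k > 0);
    // Min-heap holding the k best weights seen so far; top() is the k-th best.
    std::priority_queue<double, std::vector<double>, std::greater<double>> top;
    std::size_t examined = 0;
    for (double w : weights) {
        ++examined;
        top.push(w);
        if (top.size() > k) {
            top.pop();  // drop the smallest so only k weights remain
        }
        // No later document can beat the bound, so the top k is final.
        if (top.size() == k && top.top() >= max_weight) {
            break;
        }
    }
    return examined;
}
```

In this model, with 100 documents of which the first 10 have weight 1 and the bound is 1, the scan stops after 10 documents. With only 9 documents of weight 1, or with the bound set to a large number, all 100 are examined.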

