[Xapian-discuss] Revision 11671 cursory observations wrt sort performance

Henry henka at cityweb.co.za
Sat Dec 6 16:21:39 GMT 2008


Quoting myself:
> b)  *Is* Xapian sorting through all 11-15k results above?  With
> performance an issue when sorting, I wonder:  I seem to vaguely recall
> an index search approach which roughly did the following:  since the
> user will only ever possibly view (say) 1000 results, why bother
> grinding through all 1 million results (or 10-15k in my tests above)
> to sort, etc?  ie, only gather and collate those results (say, 1000)
> with the highest scores (or those which have a particular 'field'
> above a certain threshold), discarding the rest, but still returning a
> "hit" total of X for display/informational purposes only... or is
> Xapian already doing this?

Apologies for answering myself:  if I understand the docs correctly,
Enquire::get_mset() looks like the way to go.  It seems to estimate the
total number of matches without considering every one of them.  It also
accepts an optional match decider (the mdecider argument), but having
that decision function called for every candidate match sounds
expensive (in Perl at least).
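
For what it's worth, the "only keep the best N" idea from the quoted
question can be sketched in a few lines.  This is a minimal illustration
in Python, not Xapian's actual matcher:  top_k_with_total and the
(doc_id, score) tuples are hypothetical names invented here.  Unlike an
estimate, it still visits every match once to count it; it only avoids
sorting them all, by holding at most k candidates in a bounded min-heap.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k_with_total(
    scored_docs: Iterable[Tuple[int, float]], k: int
) -> Tuple[List[Tuple[int, float]], int]:
    """Return the k best (doc_id, score) pairs plus the total match count.

    Hypothetical helper for illustration only; not part of any Xapian API.
    """
    heap: List[Tuple[float, int]] = []  # min-heap: weakest kept score on top
    total = 0
    for doc_id, score in scored_docs:
        total += 1  # every match counts towards the displayed "hit" total
        if k <= 0:
            continue  # nothing to keep, only count
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # Better than the weakest kept match: swap it in.
            heapq.heapreplace(heap, (score, doc_id))
    # Only the k survivors are sorted, best score first.
    ranked = sorted(heap, key=lambda item: (-item[0], item[1]))
    return [(doc_id, score) for score, doc_id in ranked], total


if __name__ == "__main__":
    # 15000 synthetic matches with made-up scores.
    matches = [(i, float((i * 37) % 101)) for i in range(15000)]
    best, total = top_k_with_total(matches, 1000)
    print(len(best), "kept out of", total, "matches")
```

The work per match is O(log k) rather than a full O(n log n) sort, so
the cost of ordering depends mostly on how many results are kept, not
on how many matched.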

I'll see tomorrow whether this is more efficient to use.

Regards
Henry


