[Xapian-devel] about sort_by_value

Olly Betts olly at survex.com
Wed Mar 26 02:34:30 GMT 2014


On Wed, Mar 26, 2014 at 10:09:15AM +0800, 没有锂称 wrote:
> Hello, I have found that using sort_by_value is very slow.
> With 16800 results and only the top 10 returned, sorting takes about 25ms.

25ms is "very slow"?

> And if you do not sort, returning 10 takes only about 0.3ms.

It's not that sorting 10 results is taking 24.7ms - we need to find
the top 10 results when sorting by value, which is generally a different
10 to when sorting by relevance.

Part of the difference is that when sorting by relevance, Xapian has
various clever optimisations it can apply which reduce the number of
documents which need to be considered, but most of these aren't
applicable when sorting by value.  So really, it's sorting by relevance
that is very fast.

Another factor is that when sorting by value we need to actually
get the value for each matching document so that we can sort them, which
is extra data to read.  We may save a little by not reading the document
lengths, but those are much smaller than most values.
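To illustrate the two points above, here is a minimal sketch in plain
C++ (no Xapian calls) of picking the top k matches by a stored value.
The Match struct and top_k_by_value() are hypothetical names made up
for this example, and this is not how Xapian's matcher is implemented.
It only shows that the sort key of every match has to be fetched and
compared, even though just k results come back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one matching document: its id plus the
// stored value we want to sort on.
struct Match {
    unsigned docid;
    std::string sort_key;
};

// Return the k matches with the smallest sort keys, in ascending order.
// Every element of `matches` is examined, because any of them could
// belong in the top k.
std::vector<Match> top_k_by_value(std::vector<Match> matches, std::size_t k)
{
    k = std::min(k, matches.size());
    assert(k <= matches.size());
    std::partial_sort(matches.begin(), matches.begin() + k, matches.end(),
                      [](const Match& a, const Match& b) {
                          if (a.sort_key != b.sort_key)
                              return a.sort_key < b.sort_key;
                          return a.docid < b.docid;  // tie-break on docid
                      });
    matches.erase(matches.begin() + k, matches.end());
    return matches;
}
```

With 16800 matches and k = 10, the comparison work is small, but all
16800 keys still have to be read, which is where the size of each
stored value starts to matter.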

> How can I make the sort faster?

You don't say what version of Xapian you're using, or what values
you're sorting on, so it's hard to give very specific advice.  But
a couple of general points:

You definitely want to use a backend with value streams (so chert or
newer, which means Xapian 1.2.x).

If you can store the values to sort on more compactly, that will help.
E.g. use Xapian::sortable_serialise() rather than an ASCII string like
"0001234.5678" if the sort keys are floating point numbers.
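As a rough illustration of why a binary sort key can be more compact
than a padded ASCII string, here is a sketch of an order-preserving
8-byte encoding of a double.  encode_sortable() is a hypothetical
helper written for this example.  It is NOT the format produced by
Xapian::sortable_serialise(), and it assumes a 64-bit IEEE-754 double.
In real indexing code you would just call Xapian::sortable_serialise()
and store the returned string with Xapian::Document::add_value().

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

// Illustrative only: pack a double into 8 bytes whose bytewise
// (unsigned) order matches numeric order.
std::string encode_sortable(double v)
{
    assert(!std::isnan(v));  // NaN has no place in a total order
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    if (bits >> 63) {
        // Negative: flip every bit so more negative values sort first.
        bits = ~bits;
    } else {
        // Non-negative: set the top bit so these sort after negatives.
        bits |= (std::uint64_t(1) << 63);
    }
    std::string out(8, '\0');
    for (int i = 7; i >= 0; --i) {  // big-endian: most significant byte first
        out[i] = static_cast<char>(bits & 0xff);
        bits >>= 8;
    }
    return out;
}
```

The padded string "0001234.5678" takes 12 bytes, against 8 for this
sketch, and the string comparison used for sorting still gives numeric
order.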

Cheers,
    Olly


