[Xapian-discuss] Faster "sort by value"

Arjen van der Meijden acmmailing at tweakers.net
Thu Jun 8 19:03:28 BST 2006


The patch gives a nice speed-up when tested on our database. I copied our 
compacted flint database to a (much slower) test machine and compiled both 
an unpatched and a patched Xapian 0.9.6 on it.

I ran two different queries against each build, both sorted by value and 
unsorted. I didn't vary any other parameters, nor did I try different 
machines or databases, but I don't think that would change the numbers much.

Here are the results for 'test' (115103 hits out of 1106781 documents):

The tests ran in the order 'fast no sort, fast sort, old no sort, old sort' 
('fast' being the patched build, 'new' in the table), repeated in a loop of 5.

new no sort   new sort    old no sort   old sort
0m1.342s      0m3.188s    0m0.162s      0m14.677s
0m0.125s      0m1.011s    0m0.099s      0m9.297s
0m0.160s      0m1.033s    0m0.100s      0m9.350s
0m0.134s      0m1.023s    0m0.102s      0m9.982s
0m0.117s      0m1.015s    0m0.098s      0m9.311s
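
For anyone wanting to repeat this kind of measurement, the test order described above can be sketched roughly as follows. This is only an illustration, not the script that produced these numbers: the function names are made up, the commands are placeholders for whatever search programs are built against the patched and unpatched libraries, and it measures wall-clock time, which is an assumption about what the table reports.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Return the wall-clock seconds taken by one run of argv."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_rounds(variants, rounds=5):
    """Time every variant once per round, in the same order each round.

    variants: list of (label, argv) pairs.
    Returns one {label: seconds} dict per round.
    """
    results = []
    for _ in range(rounds):
        # Dicts keep insertion order, so the variants run in the order given.
        results.append({label: time_command(argv) for label, argv in variants})
    return results


if __name__ == "__main__":
    # Placeholder command (hypothetical): it only starts a Python interpreter.
    # Substitute the real query programs for each of the four variants.
    placeholder = [sys.executable, "-c", "pass"]
    variants = [
        ("new no sort", placeholder),
        ("new sort", placeholder),
        ("old no sort", placeholder),
        ("old sort", placeholder),
    ]
    for row in run_rounds(variants):
        print("\t".join(f"{row[label]:.3f}s" for label, _ in variants))
```

Interleaving the variants inside each round, rather than running one variant five times in a row, means that caching effects from one variant show up in the next, which is worth keeping in mind when reading the first row of each table.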

Running only the two 'no sort' variants shows that their accumulated times 
don't really differ. Perhaps the old sorting code pushes a bit of data 
out of memory, which would slow down the 'new no sort' run that follows it 
in the loop?

And here the same for the query 'windows xp' (estimated 373626 hits out 
of 1106781).

new no sort   new sort    old no sort   old sort
0m0.474s      0m2.164s    0m0.178s      0m7.292s
0m0.239s      0m2.141s    0m0.175s      0m8.050s
0m0.309s      0m2.864s    0m0.176s      0m7.764s
0m0.238s      0m2.209s    0m0.174s      0m19.005s
0m0.674s      0m3.812s    0m0.177s      0m20.963s
0m0.243s      0m2.793s    0m0.204s      0m12.303s
0m0.496s      0m2.237s    0m0.174s      0m9.791s
0m0.328s      0m2.338s    0m0.174s      0m10.961s
0m0.283s      0m2.125s    0m0.180s      0m9.394s
0m0.233s      0m2.107s    0m0.175s      0m11.360s

So there is a nice improvement, by a factor varying from 3 to 9 or so.

But sorting by value still takes about 8 to 10 times as long as sorting by 
relevance, so there might be a bit more room for improvement?
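
Both factors can be checked row by row from the tables: old sort divided by new sort gives the speed-up, and new sort divided by new no sort gives the remaining cost of sorting by value. A small sketch of that arithmetic, with the timings copied from the tables above (the helper names are just for illustration):

```python
import re


def seconds(stamp):
    """Convert a time string such as '0m9.297s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Columns: new no sort, new sort, old no sort, old sort.
TEST = [
    ("0m1.342s", "0m3.188s", "0m0.162s", "0m14.677s"),
    ("0m0.125s", "0m1.011s", "0m0.099s", "0m9.297s"),
    ("0m0.160s", "0m1.033s", "0m0.100s", "0m9.350s"),
    ("0m0.134s", "0m1.023s", "0m0.102s", "0m9.982s"),
    ("0m0.117s", "0m1.015s", "0m0.098s", "0m9.311s"),
]
WINDOWS_XP = [
    ("0m0.474s", "0m2.164s", "0m0.178s", "0m7.292s"),
    ("0m0.239s", "0m2.141s", "0m0.175s", "0m8.050s"),
    ("0m0.309s", "0m2.864s", "0m0.176s", "0m7.764s"),
    ("0m0.238s", "0m2.209s", "0m0.174s", "0m19.005s"),
    ("0m0.674s", "0m3.812s", "0m0.177s", "0m20.963s"),
    ("0m0.243s", "0m2.793s", "0m0.204s", "0m12.303s"),
    ("0m0.496s", "0m2.237s", "0m0.174s", "0m9.791s"),
    ("0m0.328s", "0m2.338s", "0m0.174s", "0m10.961s"),
    ("0m0.283s", "0m2.125s", "0m0.180s", "0m9.394s"),
    ("0m0.233s", "0m2.107s", "0m0.175s", "0m11.360s"),
]


def ratios(rows):
    """Return (old sort / new sort, new sort / new no sort) for each row."""
    out = []
    for row in rows:
        new_plain, new_sort, _, old_sort = (seconds(s) for s in row)
        out.append((old_sort / new_sort, new_sort / new_plain))
    return out


for name, rows in (("test", TEST), ("windows xp", WINDOWS_XP)):
    for speedup, overhead in ratios(rows):
        print(f"{name}: old/new sort {speedup:.1f}x, sort/no sort {overhead:.1f}x")
```

The first row of each table includes a cold start, so its ratios are less representative than the later rows.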

Best regards,

Arjen

On 6-6-2006 21:04, Olly Betts wrote:
> On Tue, Jun 06, 2006 at 08:50:37PM +0200, Arjen van der Meijden wrote:
>> I'll take a look at it somewhere this week.
> 
> Cool.
> 
>> We probably need to upgrade to 0.9.6 for this?
> 
> I imagine it'll apply cleanly to much older versions too, it's just
> a single hunk patch and I doubt that part of the code has changed in
> years (or I'd probably have noticed this before!)
> 
>> Do you have any idea how well that old "0.9.2 zlib"-patch will apply
>> to those sources? Or can you create a new one for me?
> 
> If it doesn't apply cleanly, let me know and I'll update it.  I suspect
> it'll be OK though.
> 
>> Btw, the url yields an HTTP 404 Not Found.
> 
> Sigh, that would be because I'm an idiot.  This is the correct URL:
> 
> http://www.oligarchy.co.uk/xapian/patches/xapian-faster-sort-by-value.patch
> 
> Cheers,
>     Olly
> 


