[Xapian-discuss] Flint Backend

Arjen van der Meijden acmmailing at tweakers.net
Tue Jun 28 20:32:46 BST 2005


Here is the output of quartzcompact:
postlist: Reduced by 0.537767% 7504K (1395400K -> 1387896K)
record: Reduced by 0.311498% 544K (174640K -> 174096K)
termlist: Reduced by 0.423684% 5168K (1219776K -> 1214608K)
position: Reduced by 0.0208535% 1512K (7250576K -> 7249064K)
value: Reduced by 0.0307314% 16K (52064K -> 52048K)

I started from the "quartzcompact 0.8.4 + zlib" database; the output above 
is from quartzcompact 0.9.1-svn run with -F, again with zlib.
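
For anyone who wants to check the figures, here is a minimal Python sketch that recomputes the savings from the before/after sizes printed above. The sizes are copied from the quartzcompact output; the names `sizes_k` and `reduction` are only illustrative and are not part of any Xapian tool.

```python
# Sizes in K, copied from the quartzcompact output above: (before, after).
sizes_k = {
    "postlist": (1395400, 1387896),
    "record": (174640, 174096),
    "termlist": (1219776, 1214608),
    "position": (7250576, 7249064),
    "value": (52064, 52048),
}


def reduction(before_k, after_k):
    """Return (K saved, percentage saved relative to the original size)."""
    saved = before_k - after_k
    return saved, 100.0 * saved / before_k


for table, (before_k, after_k) in sizes_k.items():
    saved, pct = reduction(before_k, after_k)
    print(f"{table}: Reduced by {pct:.6g}% {saved}K ({before_k}K -> {after_k}K)")

# Overall saving across all five tables.
total_before = sum(before for before, _ in sizes_k.values())
total_after = sum(after for _, after in sizes_k.values())
total_saved, total_pct = reduction(total_before, total_after)
print(f"total: Reduced by {total_pct:.3g}% {total_saved}K")
```

Summed over all five tables the second compaction saves 14744K, well under 1% of the total, so the 0.8.4 compaction had already removed nearly all of the slack.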

At the moment I'm running benchmarks on all the databases I created 
earlier. Once I have the results, I'll send them to the list as well.

Best regards,

Arjen

On 27-6-2005 0:45, Olly Betts wrote:
> On Sun, Jun 26, 2005 at 10:43:32AM +0200, Arjen van der Meijden wrote:
> 
>>         Qz          Qz 084 gz    Qz -nF gz
>>Position 7424589824  7424589824  7432200192
>>Postlist 1708957696  1428889600  1535426560
>>Record    254222336   178831360   179888128
>>Termlist 1770250240  1249050624  1395597312
>>Value      61317120    53313536    53313536
> 
> 
> I think "-nF" is probably larger because of the "-n".  Can you try with
> just "-F"?
> 
> 
>>Here the xapian-compact results of the flint database. Here -n -F and -F 
>>produced exactly the same table sizes but they were smaller than the 
>>original compaction-try. Please do note the position-table is larger 
>>than in the quartz compacted-cases.
>>
>>         Flint       Flint -nF/-F
>>Position 7452794880  7451574272
>>Postlist 1644240896  1634279424
>>Record    255377408   254418944
>>Termlist 1772339200  1764106240
>>Value      62177280    62177280
> 
> 
> OK, so comparing against the non-zlib, we're a bit better for postlist,
> and a bit worse for record/termlist/value.  I suspect that's mostly
> down to the longer keys, which will be resolved when I replace the Btree
> manager (I'm going to make the key compare a virtual function which can
> be different for each table, rather than having to encode the keys in
> such a way that the byte contents compare in the desired order).
> 
> It's a shame that the new position table encoding isn't smaller for you.
> I think I might need to look at your data at some point, but I'll try
> some more examples locally first in case it's the one I've been using
> which is atypical.
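
To illustrate the key-ordering trade-off Olly describes above, here is a small Python sketch. It is not Xapian's actual key format (the thread doesn't show it); the two encodings and every function name below are hypothetical. It only shows why keys that must sort correctly as raw bytes tend to be longer than keys that are ordered by a per-table comparison function.

```python
import functools
import struct


def encode_sortable(n):
    # Hypothetical "byte-comparable" key: fixed-width big-endian, so plain
    # byte-wise comparison matches numeric order, at 8 bytes per key.
    return struct.pack(">Q", n)


def encode_compact(n):
    # Hypothetical compact key: little-endian base-128 varint. Short for
    # small numbers, but raw bytes do not sort in numeric order.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_compact(data):
    n = 0
    for position, byte in enumerate(data):
        n |= (byte & 0x7F) << (7 * position)
    return n


def compare_compact(a, b):
    # Stand-in for a per-table compare function: decode, then compare.
    x, y = decode_compact(a), decode_compact(b)
    return (x > y) - (x < y)


ids = [129, 256, 2, 70000]

sortable_keys = sorted(encode_sortable(n) for n in ids)
compact_keys = [encode_compact(n) for n in ids]

# Plain byte order is wrong for the compact keys...
bytewise = [decode_compact(k) for k in sorted(compact_keys)]
# ...but correct once the table supplies its own comparison.
custom = [decode_compact(k)
          for k in sorted(compact_keys,
                          key=functools.cmp_to_key(compare_compact))]

print("byte order:   ", bytewise)
print("custom order: ", custom)
print("key bytes:    ", sum(map(len, sortable_keys)), "vs",
      sum(map(len, compact_keys)))
```

In this toy case the byte-comparable keys take 32 bytes in total against 8 for the compact ones; the real saving would depend on the actual key layout of each table.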

I had some very unexpected results with the position tables of the 
various quartz databases: the uncompacted version was 50% *faster* than 
the compacted ones. I've changed the benchmarking, hoping it was just an 
issue with how the data was laid out on disk.

I'll probably have to investigate it a bit more, depending on the 
outcome of the current benchmarks.

>>Is reading from the working, instead of the compacted database a 
>>cause?
> 
> Almost certainly - there's probably less to read (though bear in mind
> that the working database will have a number of blocks which aren't
> in use in the current version and these don't need to be read to copy
> it), but more to the point a database which is compact with revision
> 1 (like that quartzcompact and xapian-compact produce) is more efficient
> to read and iterate over.

I'll test this on the machine that isn't under load as well sometime soon.

Best regards,

Arjen


