Xapian 1.3.5 snapshot performance and index size

Jean-Francois Dockes jf at dockes.org
Tue Apr 12 10:28:52 BST 2016


Olly Betts writes:
 > On Mon, Apr 11, 2016 at 09:54:36AM +0200, Jean-Francois Dockes wrote:
 > > The question which remains for me is if I should run xapian-compact
 > > after an initial indexing operation. I guess that this depends on the
 > > amount of expected updates and that there is no easy answer?
 > 
 > I think it's not obvious whether it's a good plan to or not.
 > 
 > Ideally we'd find a way to make it come out more compact to start with.
 > 
 > One thing which could help is making glass more willing to switch to
 > "sequential mode".  If you fancy some more benchmarking, you could
 > try changing SEQ_START_POINT in backends/glass/glass_table.cc.
 > 
 > It defaults to -10, but I don't think anyone has tried tuning it
 > recently (this value comes from Martin's original code in commit
 > 26bd647ff6084c60d8869f27d6abbd99e06c3f30 back in 2000 - he may have done
 > tests to select it, but even if he did, so much has changed since).
 > Something like -3 or -4 might work well - probably enough that it
 > shouldn't enable when it's not useful, and by default we ensure at least
 > 4 items fit in a block.

Ok, I tried this, with not much luck.

I used a script to edit the SEQ_START_POINT value, then rebuild and
install Xapian, then run the indexing.
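
For illustration only, the editing step could look roughly like the sketch below. This is not the script from the gist: the function name set_seq_start_point is hypothetical, and it assumes the define appears in the source exactly once, in the form "#define SEQ_START_POINT (-10)" shown in the output further down. The rebuild, install and indexing steps are left out.

```python
import re

# Matches a whole line such as:  #define SEQ_START_POINT (-10)
# (assumed form of the define; parentheses and sign are optional)
_DEFINE_RE = re.compile(
    r"^(#define[ \t]+SEQ_START_POINT[ \t]+)\(?-?\d+\)?[ \t]*$",
    re.MULTILINE,
)


def set_seq_start_point(source: str, value: int) -> str:
    """Return `source` with the SEQ_START_POINT define set to `value`.

    Raises ValueError unless exactly one define is found, so a silent
    no-op edit cannot be mistaken for a real change.
    """
    new_source, count = _DEFINE_RE.subn(
        lambda m: f"{m.group(1)}({value})", source
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one SEQ_START_POINT define, found {count}"
        )
    return new_source


if __name__ == "__main__":
    sample = "int a;\n#define SEQ_START_POINT (-10)\nint b;\n"
    print(set_seq_start_point(sample, -4))
```

Failing loudly when the define is not found is deliberate: it rules out the case where every run was built with the same value.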

Sizes don't change much... Maybe I did something wrong:

https://gist.github.com/medoc92/1ad2a232e4b36e2993ce9adc5789a60a

The output follows (I edited out the unchanging recoll config dumps).

Jf


*******LIB*****************
Tue Apr 12 10:43:14 CEST 2016
#define SEQ_START_POINT (-10)
-rwxr-xr-x 1 root root 30728315 Apr 12 10:43 /usr/lib/libxapian-1.3.so.6
*************************
452.68user 124.94system 4:42.27elapsed 204%CPU (0avgtext+0avgdata 1055204maxresident)k
0inputs+21046192outputs (0major+41137071minor)pagefaults 0swaps
*************************
793244	/home/dockes/.recoll/xapiandb
total 793240
-rw-r--r-- 1 dockes dockes  24150016 Apr 12 10:47 docdata.glass
-rw-r--r-- 1 dockes dockes         0 Apr 12 10:47 flintlock
-rw-r--r-- 1 dockes dockes       130 Apr 12 10:47 iamglass
-rw-r--r-- 1 dockes dockes 577527808 Apr 12 10:47 position.glass
-rw-r--r-- 1 dockes dockes 120905728 Apr 12 10:47 postlist.glass
-rw-r--r-- 1 dockes dockes  89677824 Apr 12 10:47 termlist.glass
*************************

*******LIB*****************
Tue Apr 12 10:48:04 CEST 2016
#define SEQ_START_POINT (-7)
-rwxr-xr-x 1 root root 30728315 Apr 12 10:48 /usr/lib/libxapian-1.3.so.6
*************************
449.64user 124.36system 4:48.82elapsed 198%CPU (0avgtext+0avgdata 1074832maxresident)k
8inputs+22874712outputs (0major+41448062minor)pagefaults 0swaps
*************************
791324	/home/dockes/.recoll/xapiandb
total 791320
-rw-r--r-- 1 dockes dockes  24141824 Apr 12 10:52 docdata.glass
-rw-r--r-- 1 dockes dockes         0 Apr 12 10:52 flintlock
-rw-r--r-- 1 dockes dockes       130 Apr 12 10:52 iamglass
-rw-r--r-- 1 dockes dockes 577921024 Apr 12 10:52 position.glass
-rw-r--r-- 1 dockes dockes 119078912 Apr 12 10:52 postlist.glass
-rw-r--r-- 1 dockes dockes  89153536 Apr 12 10:52 termlist.glass
*************************

*******LIB*****************
Tue Apr 12 10:53:00 CEST 2016
#define SEQ_START_POINT (-4)
-rwxr-xr-x 1 root root 30728315 Apr 12 10:52 /usr/lib/libxapian-1.3.so.6
*************************
451.16user 128.46system 5:35.34elapsed 172%CPU (0avgtext+0avgdata 1060184maxresident)k
16inputs+24076448outputs (0major+41924101minor)pagefaults 0swaps
*************************
789020	/home/dockes/.recoll/xapiandb
total 789016
-rw-r--r-- 1 dockes dockes  24150016 Apr 12 10:58 docdata.glass
-rw-r--r-- 1 dockes dockes         0 Apr 12 10:58 flintlock
-rw-r--r-- 1 dockes dockes       130 Apr 12 10:58 iamglass
-rw-r--r-- 1 dockes dockes 578453504 Apr 12 10:58 position.glass
-rw-r--r-- 1 dockes dockes 115941376 Apr 12 10:58 postlist.glass
-rw-r--r-- 1 dockes dockes  89391104 Apr 12 10:58 termlist.glass
*************************

*******LIB*****************
Tue Apr 12 10:58:43 CEST 2016
#define SEQ_START_POINT (-3)
-rwxr-xr-x 1 root root 30728315 Apr 12 10:58 /usr/lib/libxapian-1.3.so.6
*************************
458.04user 125.02system 5:18.14elapsed 183%CPU (0avgtext+0avgdata 1048328maxresident)k
0inputs+22002000outputs (0major+40947584minor)pagefaults 0swaps
*************************
786756	/home/dockes/.recoll/xapiandb
total 786752
-rw-r--r-- 1 dockes dockes  24150016 Apr 12 11:03 docdata.glass
-rw-r--r-- 1 dockes dockes         0 Apr 12 11:04 flintlock
-rw-r--r-- 1 dockes dockes       130 Apr 12 11:04 iamglass
-rw-r--r-- 1 dockes dockes 577871872 Apr 12 11:03 position.glass
-rw-r--r-- 1 dockes dockes 114171904 Apr 12 11:04 postlist.glass
-rw-r--r-- 1 dockes dockes  89423872 Apr 12 11:03 termlist.glass
*************************

*******LIB*****************
Tue Apr 12 11:04:08 CEST 2016
#define SEQ_START_POINT (-2)
-rwxr-xr-x 1 root root 30728315 Apr 12 11:04 /usr/lib/libxapian-1.3.so.6
*************************
452.14user 122.41system 4:55.79elapsed 194%CPU (0avgtext+0avgdata 1060256maxresident)k
40inputs+22850200outputs (0major+38276837minor)pagefaults 0swaps
*************************
784960	/home/dockes/.recoll/xapiandb
total 784956
-rw-r--r-- 1 dockes dockes  24141824 Apr 12 11:09 docdata.glass
-rw-r--r-- 1 dockes dockes         0 Apr 12 11:09 flintlock
-rw-r--r-- 1 dockes dockes       130 Apr 12 11:09 iamglass
-rw-r--r-- 1 dockes dockes 578920448 Apr 12 11:09 position.glass
-rw-r--r-- 1 dockes dockes 111460352 Apr 12 11:09 postlist.glass
-rw-r--r-- 1 dockes dockes  89251840 Apr 12 11:09 termlist.glass
*************************
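
To make the listings above easier to compare, here is a small sketch that computes the size change of each table relative to the -10 run. The byte counts and du totals are copied from the output above; the names RUNS, pct_change and summarize are made up for this example.

```python
# Sizes copied from the listings above, keyed by SEQ_START_POINT.
# "du_kb" is the du total in KB, the other values are file sizes in bytes.
RUNS = {
    -10: {"du_kb": 793244, "docdata": 24150016, "position": 577527808,
          "postlist": 120905728, "termlist": 89677824},
    -7: {"du_kb": 791324, "docdata": 24141824, "position": 577921024,
         "postlist": 119078912, "termlist": 89153536},
    -4: {"du_kb": 789020, "docdata": 24150016, "position": 578453504,
         "postlist": 115941376, "termlist": 89391104},
    -3: {"du_kb": 786756, "docdata": 24150016, "position": 577871872,
         "postlist": 114171904, "termlist": 89423872},
    -2: {"du_kb": 784960, "docdata": 24141824, "position": 578920448,
         "postlist": 111460352, "termlist": 89251840},
}


def pct_change(baseline, value):
    """Percentage change of `value` relative to `baseline`."""
    return 100.0 * (value - baseline) / baseline


def summarize(runs, baseline_key=-10):
    """Return [(seq_start_point, {name: pct change vs baseline})], sorted by key."""
    base = runs[baseline_key]
    return [
        (key, {name: pct_change(base[name], size)
               for name, size in runs[key].items()})
        for key in sorted(runs)
    ]


if __name__ == "__main__":
    for key, changes in summarize(RUNS):
        cells = "  ".join(f"{name}={pct:+.2f}%" for name, pct in changes.items())
        print(f"SEQ_START_POINT={key:>3}: {cells}")
```

With these numbers, postlist.glass shrinks by roughly 8% between -10 and -2, but the du total only drops by about 1%, because position.glass dominates the index and barely moves.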





