[Xapian-discuss] quartzcompact question

Olly Betts olly at survex.com
Mon May 2 23:54:10 BST 2005

On Mon, May 02, 2005 at 08:24:37PM -0400, info at bannershift.com wrote:
> I wanted to know how I can gain the best performance.
> I have thought about a number of possibilities.
> I can do quartzcompact on a snapshot of a Xapian index before
> uploading it to a search engine.
> I can also do quartzcompact on an index that is constantly changing.
> My question is what the best alternative is for an index under constant
> update and for an index that is used for search only.

I would suggest only compacting the database used for searching.
Compacting will definitely speed up the search.

Compacting tries to fill all the blocks in the B-tree as full as
possible - otherwise they'll typically be around 75% full, unless you've
been inserting sequentially (which you mostly will be if you just append
new documents to the end of the database, except for the postlist table)
in which case blocks usually end up 90+% full.
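
To illustrate the fill levels described above, here is a minimal Python
sketch of a toy model. It is not Quartz's actual B-tree code: the block
capacity, the split rule and the name simulate_fill are all assumptions
made for the example. A full block is split in half, unless the new key
goes past the end of the block, in which case a fresh block is started
(a crude stand-in for sequential mode). The exact percentages from this
model will not match Quartz, but random inserts should leave blocks
noticeably emptier than sequential ones.

```python
import bisect
import random


def simulate_fill(keys, capacity=64):
    """Insert distinct keys into sorted leaf blocks; return average fill."""
    blocks = [[]]              # each block is a sorted list of keys
    lows = [float("-inf")]     # lows[i] is the lower bound of blocks[i]
    for key in keys:
        # Locate the block whose key range covers this key.
        i = bisect.bisect_right(lows, key) - 1
        block = blocks[i]
        if len(block) == capacity:
            if key > block[-1]:
                # Appending past the end: leave the old block full and
                # start a new one (toy version of sequential mode).
                new_block, new_low = [], key
            else:
                # Ordinary split: move the upper half into a new block.
                new_block = block[capacity // 2:]
                del block[capacity // 2:]
                new_low = new_block[0]
            blocks.insert(i + 1, new_block)
            lows.insert(i + 1, new_low)
            if key >= new_low:
                block = new_block
        bisect.insort(block, key)
    return sum(len(b) for b in blocks) / (capacity * len(blocks))


n = 20_000
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)

random_fill = simulate_fill(shuffled)        # keys arrive in random order
sequential_fill = simulate_fill(range(n))    # keys arrive in sorted order

print(f"random inserts:     {random_fill:.0%} full")
print(f"sequential inserts: {sequential_fill:.0%} full")
```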

If you compact then update, updates will need to do a lot of block
splitting so will be slower for a while (until the database gets back to
a more typical state).
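
The block-splitting argument can be sketched with the same kind of toy
model. Again this is not Quartz code: the capacity, the halving split
rule and the helper names (insert, packed, grown) are assumptions for
illustration. It counts how many splits a batch of random updates causes
when the tree starts out completely packed, compared with a tree that
grew through random inserts. It only counts splits and says nothing
about I/O, so it does not settle which approach is faster in practice.

```python
import bisect
import random

CAPACITY = 64  # keys per block in this toy model


def insert(blocks, lows, key):
    """Insert key into the block list; return 1 if a split was needed."""
    i = bisect.bisect_right(lows, key) - 1
    block = blocks[i]
    split = 0
    if len(block) == CAPACITY:
        split = 1
        # Move the upper half of the full block into a new block.
        right = block[CAPACITY // 2:]
        del block[CAPACITY // 2:]
        blocks.insert(i + 1, right)
        lows.insert(i + 1, right[0])
        if key >= right[0]:
            block = right
    bisect.insort(block, key)
    return split


def packed(keys):
    """Lay keys out in completely full blocks (toy 'compacted' tree)."""
    keys = sorted(keys)
    blocks = [keys[i:i + CAPACITY]
              for i in range(0, len(keys), CAPACITY)] or [[]]
    lows = [float("-inf")] + [b[0] for b in blocks[1:]]
    return blocks, lows


def grown(keys):
    """Build the tree by inserting keys one at a time in the given order."""
    blocks, lows = [[]], [float("-inf")]
    for key in keys:
        insert(blocks, lows, key)
    return blocks, lows


def count_splits(blocks, lows, new_keys):
    """Apply a batch of updates and count the splits they trigger."""
    return sum(insert(blocks, lows, key) for key in new_keys)


rng = random.Random(0)
keys = rng.sample(range(1_000_000), 22_000)   # distinct keys, random order
old, new = keys[:20_000], keys[20_000:]

compact_splits = count_splits(*packed(old), new)
typical_splits = count_splits(*grown(old), new)

print("splits after compaction:", compact_splits)
print("splits from typical state:", typical_splits)
```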

Or at least this is the reasoning behind the standard advice on full
compaction - I've never actually done timing experiments to verify this.
It sounds plausible, but it's crossed my mind before that the reduced
I/O required by the compact database might tip the balance the other way
at least sometimes.  So if you really want to know, try both ways (and
let us know how you get on!)

I suspect it's also worthwhile compacting if you've deleted a lot of
documents.  It'll eliminate the now unused blocks from the B-tree tables
if nothing else.

0.9.0's quartzcompact will have a "non full compaction" option, and also a
"fuller compaction" option.  "Fuller compaction" squeezes a little extra
size off, but is definitely not recommended if you plan to update again.

0.9.0's quartz backend has also been tweaked to fill blocks slightly more
compactly in sequential mode too, so it might appear that quartzcompact
is less effective in 0.9.0 (in fact it's just that databases are a bit
more compact to start with!)

(Hopefully I'll get 0.9.0 released this week.)

