[Xapian-discuss] xapian indexing size?

Olly Betts olly at survex.com
Thu May 5 21:18:39 BST 2005


On Thu, May 05, 2005 at 08:52:32PM +0100, James Aylett wrote:
> On Thu, May 05, 2005 at 07:14:19PM +0100, Olly Betts wrote:
> 
> > Currently document data is stored uncompressed (I have patches to
> > use zlib I'll be integrating soon)
> 
> I assume this is a patch to Xapian, and not to simpleindex :-)

Erm, yes.  I mean the patches I posted to the list a while back
(probably to xapian-devel actually).

Once 0.9.0 is out of the door, my plan is to make a new backend by
copying the quartz backend, renaming the copy (to "flint"), and then
rewriting bits of it.  So if you want a stable backend, you can use the quartz
backend.  If you don't mind rebuilding databases after every release,
you can try out flint.  Once the changes have settled down, we can
switch over.

I'm planning to merge the zlib changes into flint to avoid creating
a new quartz database version.

I've got most of the new design sketched out now, although inevitably
there's some fluidity as you can't always see what will work best
without trying it for real.

The first change is likely to be to specialise the structure of the
branch blocks in the B-tree.  Currently they have the same structure
as leaf blocks, but a more compact structure is possible, which would
allow us to fit more keys in a branch block.  That's good in itself,
but it also reduces the number of branch levels required, which reduces
the number of block reads needed to retrieve the tag for a given key.
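
To put rough numbers on the effect on branch levels, here's a
back-of-envelope sketch.  It isn't taken from the quartz or flint
code: the function names, the 2,000,000 leaf blocks, and the 100 and
200 keys per branch block are all made-up figures for illustration,
and it ignores block caching and tags which span more than one block.

```python
def branch_levels(leaf_blocks: int, keys_per_branch: int) -> int:
    """Count the branch levels needed above `leaf_blocks` leaf blocks.

    Hypothetical model: every branch block holds `keys_per_branch`
    pointers to blocks in the level below, and the top level is a
    single root block.
    """
    if keys_per_branch < 2:
        raise ValueError("a branch block must point to at least 2 children")
    levels = 0
    blocks = leaf_blocks
    while blocks > 1:
        # Ceiling division: blocks needed to index the level below.
        blocks = -(-blocks // keys_per_branch)
        levels += 1
    return levels


def block_reads(leaf_blocks: int, keys_per_branch: int) -> int:
    """Block reads to reach a key: one per branch level, plus the leaf."""
    return branch_levels(leaf_blocks, keys_per_branch) + 1


if __name__ == "__main__":
    leaves = 2_000_000  # assumed table size, for illustration only
    for fanout in (100, 200):  # assumed keys per branch block
        print(fanout, branch_levels(leaves, fanout), block_reads(leaves, fanout))
```

With those assumed figures, doubling the keys per branch block takes
the tree from 4 branch levels to 3, so an uncached lookup needs 4
block reads rather than 5.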

Cheers,
    Olly


