[Xapian-devel] Last minute feature for 1.0.0

Olly Betts olly at survex.com
Fri May 11 05:35:22 BST 2007


On Thu, May 10, 2007 at 12:58:00PM +0100, James Aylett wrote:
> On Sat, May 05, 2007 at 12:12:49AM +0100, Richard Boulton wrote:
> 
> > Don't bite my head off, but I have one final request about this patch. 
> > Could we instead apply the minimal patch in attachment 79 (at
> > http://www.xapian.org/cgi-bin/bugzilla/attachment.cgi?id=79&action=view
> > ) which simply changes the format used to store the lastdocid value from 
> >  pack_uint_last() to pack_uint(), (and changes the unpack code 
> > correspondingly).  This would allow the metadata patches to be applied 
> > in the 1.0 series without breaking existing databases, and is such a 
> > small patch that I think the chances of it introducing new bugs are 
> > pretty small.
> 
> This makes sense to me. If we don't do it, either we have to make
> another BC break in future (since I can think of several other things
> we might want to put in the db metadata), or we'd have to have some
> nasty auto-detection auto-upgrading code, which seems the wrong
> approach given this patch is so small, even up against 1.0.0.

The minimal patch itself seems safe, but I think that the approach is
suboptimal.  I only had a quick look, but the full patch seems to be
serialising a load of (key,tag) pairs into a blob of data which gets
tacked on the end of the metainfo tag.  So we'll need to fetch it
every time we open the database, whether it's wanted or not.
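
For illustration, here is a rough sketch of the layout being discussed: a self-delimiting integer (in the style of pack_uint()) followed by an opaque blob appended to the same tag. The names encode_uint, decode_uint and make_metainfo_tag are hypothetical stand-ins, not Xapian's actual functions, and the 7-bits-per-byte encoding is an assumption made for the example. It shows that the reader knows where the integer ends, but still has the whole blob in hand whether or not it wants it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a self-delimiting encoding in the style of
// pack_uint(): 7 bits per byte, high bit set on every byte except the last.
void encode_uint(std::string& out, unsigned value) {
    while (value >= 128) {
        out += static_cast<char>((value & 0x7f) | 0x80);
        value >>= 7;
    }
    out += static_cast<char>(value);
}

// Decode one value starting at pos and advance pos past it.
// Returns false if the input ends before the value is complete.
bool decode_uint(const std::string& in, std::size_t& pos, unsigned& value) {
    value = 0;
    for (unsigned shift = 0; pos < in.size() && shift < 32; shift += 7) {
        unsigned char byte = static_cast<unsigned char>(in[pos++]);
        value |= static_cast<unsigned>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Build a tag holding the last docid followed by a serialised metadata blob.
// Because the integer is self-delimiting, the blob can follow it, but the
// whole tag (blob included) is read whenever the tag is fetched.
std::string make_metainfo_tag(unsigned lastdocid, const std::string& blob) {
    std::string tag;
    encode_uint(tag, lastdocid);
    tag += blob;
    return tag;
}
```

After decode_uint() returns, everything from pos onwards is the blob, so a reader that only wants lastdocid has still paid for fetching the rest.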

My point is that we have a handy Btree manager, whose entire purpose is
to store (key,tag) pairs!  Wouldn't it be better to store this versioned
user data using that?  Then if someone wants to store multi-KB (or
even multi-MB) tags, it's really no problem.  And they can efficiently
retrieve one piece of data without having to fetch all the others.  We
should be able to slot such data into "impossible" keys in the postlist
table, I think.
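
A minimal sketch of that idea, assuming a std::map as a stand-in for the Btree table and a made-up reserved prefix for the "impossible" keys (the class name MetaTable, its methods and the prefix bytes are all hypothetical, chosen only for this example):

```cpp
#include <map>
#include <string>

// Stand-in for a Btree table of (key,tag) pairs. User metadata lives under
// keys with a reserved prefix which is assumed never to begin a real
// postlist key (the prefix bytes here are invented for illustration).
class MetaTable {
    std::map<std::string, std::string> entries_;

    static std::string meta_key(const std::string& key) {
        static const std::string prefix("\x00\xc0", 2);
        return prefix + key;
    }

  public:
    // Store one piece of metadata as its own (key,tag) entry.
    void set_metadata(const std::string& key, const std::string& tag) {
        entries_[meta_key(key)] = tag;
    }

    // Fetch a single piece of metadata without touching any other entry.
    // Returns false if the key is not present.
    bool get_metadata(const std::string& key, std::string& tag) const {
        std::map<std::string, std::string>::const_iterator it =
            entries_.find(meta_key(key));
        if (it == entries_.end()) return false;
        tag = it->second;
        return true;
    }
};
```

With this layout a large tag stored under one key has no effect on the cost of looking up another, and nothing extra is read when the database is opened.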

Cheers,
    Olly