[Xapian-discuss] Testing document size preallocation.

Olly Betts olly at survex.com
Mon Jan 9 04:10:20 GMT 2012


On Sun, Jan 08, 2012 at 12:32:10PM -0900, Shane Spencer wrote:
> https://gist.github.com/ad2accc5b4655753923d
> 
> So here I am creating a database with no values for each small
> document and one with a bunch of blank values (uuid_blank).  Once
> those are flushed then I reopen them and start replacing the documents
> of each with identical documents that have an identical large set of
> values.  I am using replace_document and a specific document ID.
> 
> Is there a specific problem that I'm up against that shows that
> preallocation is up to 2 times slower for replacing an identically
> sized document rather than adding to its final serialized size?

To allow readers to continue to use the database while a writer is
working, Xapian uses copy-on-write.  So "preallocating" the documents
the way you are doesn't make Xapian reuse the space the document
already occupies.  Instead it allocates new space, and the old space
only becomes available for reuse once the new revision is committed.

So it's inevitably going to be slower because there's more work to do.
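
To make the copy-on-write point concrete, here is a minimal toy model in
Python.  It is only a sketch: ToyCowStore, its one-block-per-document
layout and its free lists are hypothetical names invented for
illustration, not Xapian's actual storage format or API.  It shows that
replacing a document with one of identical size still lands in a fresh
block, and that the old block is only recycled after a commit.

```python
class ToyCowStore:
    """Toy copy-on-write store, one block per document.

    Hypothetical illustration only; this is not how Xapian lays out data.
    """

    def __init__(self):
        self.blocks = []        # block number -> payload
        self.location = {}      # doc id -> block number in the working revision
        self.free = []          # blocks that may be reused right now
        self.pending_free = []  # old blocks readers of the last commit may still see

    def replace_document(self, docid, payload):
        # Never overwrite in place: always take a fresh block.
        if self.free:
            block = self.free.pop()
            self.blocks[block] = payload
        else:
            block = len(self.blocks)
            self.blocks.append(payload)
        old = self.location.get(docid)
        if old is not None:
            # The old copy stays intact until the next commit.
            self.pending_free.append(old)
        self.location[docid] = block
        return block

    def commit(self):
        # Only now do the superseded blocks become reusable.
        self.free.extend(self.pending_free)
        self.pending_free.clear()


store = ToyCowStore()
first = store.replace_document(1, b"x" * 64)   # the "preallocated" document
store.commit()
second = store.replace_document(1, b"y" * 64)  # same size, but a new block
old_still_readable = store.blocks[first] == b"x" * 64
store.commit()
third = store.replace_document(1, b"z" * 64)   # the first block is recycled now
```

In this model the same-size replacement cannot reuse the original block
within the same revision, so every replace pays for an allocation plus
the bookkeeping to free the old copy later.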

Cheers,
    Olly


