[Xapian-discuss] problem on closing writable databases

Markus Wörle mrks at mrks.de
Fri Feb 20 15:33:05 GMT 2009


On 20.02.2009 at 14:18, Olly Betts wrote:

> On Tue, Feb 03, 2009 at 05:56:15PM +0100, Markus Wörle wrote:
>> because of the issue that Xapian Btrees thin out in the long run
>
> I'm not sure I follow.  Do you just mean that if you delete a lot of
> documents, you don't immediately get the space back?  That's certainly
> true, but if you index more documents, that space should get reused.
> If the Btrees really end up becoming less efficiently used over time,
> I suspect that means there's a bug somewhere.

I mentioned that earlier in the thread "[Xapian-discuss] weak populated
b-trees?" on this list.

Your answer was:
> We don't currently ever merge under-populated blocks, unless they become
> totally empty.  We also don't ever shrink the file, even if blocks at
> the end become free.
>
> (...)
>
> For now I'd just compact the database after a lot of deletion if you
> don't plan to add back in a similar amount of data.  Even if you're
> planning to update further, the penalty for the extra block splitting
> that will be needed for a while afterwards is likely to be minor.

That is what I am doing at the moment: I compact my indexes by hand,
about every 6 months.

I am not sure whether this is a bug or not. I constantly add, remove,
and replace (with replace_document_by_term) a huge number of documents,
and occasionally I even replace all documents in an index. Whenever I do
that, the index grows slightly; over time it becomes slow, wastes RAM
(cached disk blocks), and so on. After compacting, the cycle starts
over.
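
The manual compaction step could be scripted roughly as follows. This is
only a sketch, assuming the xapian-compact command-line tool is on the
PATH and that no writer has the database open while it runs. The helper
names and the ".compact"/".old" directory suffixes are made up for
illustration, and readers that have the database open during the swap
may need to reopen it.

```python
import os
import shutil
import subprocess


def build_compact_command(source_dir, dest_dir):
    """Return the argv that compacts source_dir into dest_dir."""
    # xapian-compact takes the source database(s) followed by the destination.
    return ["xapian-compact", source_dir, dest_dir]


def compact_and_swap(db_dir, runner=subprocess.run):
    """Compact db_dir into a sibling directory, then swap it into place.

    Assumption: nothing is writing to db_dir while this runs.
    """
    tmp_dir = db_dir + ".compact"  # hypothetical naming scheme
    old_dir = db_dir + ".old"      # hypothetical naming scheme

    # Write the compacted copy next to the original database.
    runner(build_compact_command(db_dir, tmp_dir), check=True)

    # Keep the old copy around until the compacted one is in place.
    os.rename(db_dir, old_dir)
    os.rename(tmp_dir, db_dir)
    shutil.rmtree(old_dir)
    return db_dir
```

Calling compact_and_swap("/path/to/index") from a periodic job would
replace the by-hand step; the runner parameter exists only so the
function can be exercised without Xapian installed.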

> SVN trunk adds a Database::close() method, but that's probably not a
> great help to you currently.

James already mentioned that. It's okay for me to know that it will be
possible in future releases. I'm just going to compact my indexes by
hand in the meantime.

Thanks!

mrks

