[Xapian-devel] Cleaning the index

Olly Betts olly at survex.com
Sun Mar 2 06:20:54 GMT 2014

On Sat, Mar 01, 2014 at 08:33:49PM +0000, Richard Boulton wrote:
> The termlists in Xapian contain a list of all the terms in a given
> document, so when a document is deleted these are used to update the
> postlists for all the relevant terms to remove the document id. The posting
> list lengths, etc., are all updated immediately (well, at the next commit).
> This contrasts with Lucene-style systems where the documents are just
> marked as deleted and garbage collected later.
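
To make the difference concrete, here is a rough toy model of the two
deletion strategies described above. It is only a sketch: the class and
method names (ToyIndex, delete_eager, delete_lazy, collect_garbage) are
made up for illustration, it is not Xapian's or Lucene's API or storage
format, and it ignores the batching of changes until commit.

```python
from collections import defaultdict


class ToyIndex:
    """Toy inverted index. Hypothetical names; not Xapian's API or format."""

    def __init__(self):
        self.termlists = {}                # docid -> set of terms in the document
        self.postlists = defaultdict(set)  # term -> set of docids containing it
        self.tombstones = set()            # docids marked deleted, not yet purged

    def add(self, docid, terms):
        self.termlists[docid] = set(terms)
        for term in terms:
            self.postlists[term].add(docid)

    def delete_eager(self, docid):
        # Termlist-driven delete: walk the document's termlist and remove
        # the docid from every postlist it appears in, straight away.
        for term in self.termlists.pop(docid):
            self.postlists[term].discard(docid)
            if not self.postlists[term]:
                del self.postlists[term]

    def delete_lazy(self, docid):
        # Mark-and-collect delete: only record a tombstone for now.
        if docid in self.termlists:
            self.tombstones.add(docid)

    def collect_garbage(self):
        # Later clean-up pass: apply the real removal for each tombstone.
        for docid in list(self.tombstones):
            self.delete_eager(docid)
        self.tombstones.clear()

    def search(self, term):
        # Tombstoned documents have to be filtered out at query time.
        return sorted(self.postlists.get(term, set()) - self.tombstones)

    def termfreq(self, term):
        # Postlist length; stays stale for lazy deletes until clean-up.
        return len(self.postlists.get(term, ()))


index = ToyIndex()
index.add(1, ["apple", "pear"])
index.add(2, ["apple"])
index.add(3, ["apple", "plum"])

index.delete_eager(1)   # postlists for "apple" and "pear" updated now
index.delete_lazy(2)    # postlist for "apple" untouched, doc 2 only marked

visible = index.search("apple")
stale_freq = index.termfreq("apple")
index.collect_garbage()
clean_freq = index.termfreq("apple")
print(visible, stale_freq, clean_freq)
```

In this model the eager delete costs more at deletion time but keeps the
postlist lengths exact, while the lazy delete is cheap up front and leaves
the statistics stale (and searches filtering tombstones) until the
clean-up pass runs.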

Perhaps also worth noting that at some point we'll probably implement
the ability to choose either approach:



