[Xapian-devel] Index Size comparison

David Jeske davidj at gmail.com
Wed May 2 18:14:09 BST 2012

On Wed, May 2, 2012 at 6:28 AM, Olly Betts <olly at survex.com> wrote:

> My understanding is that Lucene doesn't store [a list of all terms in each
> document], and handles deletion by adding the document id to a "deleted"
> list, which has to be excluded from query results;

Yes, though these entries get cleaned up during merge/optimize, so there
isn't really a cumulative error like you implied (i.e. whenever a merge
scans over all the terms anyway, it's easy to drop the entries for
documents in the "deleted" list).
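
To make the idea concrete, here is a rough Python sketch of delete-by-list
plus cleanup at merge time. It is a toy model and not Lucene's actual code
or on-disk format: the class ToySegmentIndex and all of its methods are
hypothetical names made up for illustration, and each added document
becoming its own segment is a simplifying assumption.

```python
from collections import defaultdict


class ToySegmentIndex:
    """Toy inverted index: postings only, no per-document term list."""

    def __init__(self):
        self.segments = []    # each segment maps term -> list of doc ids
        self.deleted = set()  # the "deleted" list: doc ids to hide
        self.next_id = 0

    def add(self, text):
        # Assumption for brevity: every document starts as its own segment.
        doc_id = self.next_id
        self.next_id += 1
        self.segments.append({term: [doc_id] for term in set(text.split())})
        return doc_id

    def delete(self, doc_id):
        # Cheap: no postings are touched, the id is only recorded.
        self.deleted.add(doc_id)

    def search(self, term):
        # Deleted ids still sit in the postings, so filter them per query.
        hits = []
        for segment in self.segments:
            hits.extend(d for d in segment.get(term, []) if d not in self.deleted)
        return sorted(hits)

    def merge(self):
        # The merge walks every term anyway, so dropping deleted ids is easy.
        merged = defaultdict(list)
        for segment in self.segments:
            for term, postings in segment.items():
                merged[term].extend(d for d in postings if d not in self.deleted)
        # Terms whose postings became empty vanish from the merged segment.
        self.segments = [{t: sorted(p) for t, p in merged.items() if p}]
        self.deleted.clear()
```

With this model, deleting a document costs nothing at delete time, queries
pay a small filtering cost until the next merge, and after merge() both the
stale postings and the "deleted" list are gone, so nothing accumulates.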
