<div dir="ltr">Each termlist in Xapian lists all the terms in a given document, so when a document is deleted, its termlist is used to find the postlists for all the relevant terms and remove the document id from each of them. The posting list lengths, etc., are all updated immediately (well, at the next commit). This contrasts with Lucene-style systems, where documents are just marked as deleted and garbage collected later.<div>
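<div>To make the mechanism concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Xapian's actual storage code: the class ToyIndex and its methods are hypothetical names invented for this illustration, and buffering changes in a plain list until commit() is an assumption made to mirror the "at the next commit" behaviour described above.</div>

```python
from collections import defaultdict


class ToyIndex:
    """Toy in-memory index (hypothetical; not Xapian's real data structures)."""

    def __init__(self):
        self.termlists = {}                # docid: set of terms in that document
        self.postlists = defaultdict(set)  # term: set of docids containing it
        self.pending = []                  # changes buffered until commit()

    def add_document(self, docid, terms):
        self.pending.append(("add", docid, frozenset(terms)))

    def delete_document(self, docid):
        self.pending.append(("delete", docid, None))

    def commit(self):
        for op, docid, terms in self.pending:
            if op == "add":
                self.termlists[docid] = set(terms)
                for term in terms:
                    self.postlists[term].add(docid)
            else:
                # The termlist says exactly which postlists mention this docid,
                # so no scan over the whole index is needed.
                for term in self.termlists.pop(docid, ()):
                    self.postlists[term].discard(docid)
                    if not self.postlists[term]:
                        del self.postlists[term]
        self.pending.clear()

    def termfreq(self, term):
        # Number of documents currently indexed by this term.
        return len(self.postlists.get(term, ()))


index = ToyIndex()
index.add_document(1, ["xapian", "btree"])
index.add_document(2, ["xapian", "lucene"])
index.commit()

index.delete_document(1)
index.commit()
```

<div>After the second commit, document 1 is gone from every postlist it appeared in and the term counts already reflect that, so nothing is left behind for a later garbage-collection pass.</div>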
<br></div><div>-- </div><div>Richard</div><div class="gmail_extra"><br><br><div class="gmail_quote">On 1 March 2014 19:31, Matt Chaput <span dir="ltr">&lt;<a href="mailto:matt@whoosh.ca" target="_blank">matt@whoosh.ca</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Just curious: How does Xapian clean up postings/words from deleted documents? Does it just remove them whenever a posting node is COWed in the Btree? Or is there some kind of periodic reaper function?<br>
<br>
Thanks!<br>
<br>
Matt<br>
</blockquote></div><br></div></div>