[Xapian-tickets] [Xapian] #742: Xapian should provide a way to securely remove a document from the database

Xapian nobody at xapian.org
Thu Jun 7 04:45:45 BST 2018


#742: Xapian should provide a way to securely remove a document from the database
---------------------------+--------------------------
 Reporter:  dkg            |             Owner:  olly
     Type:  defect         |            Status:  new
 Priority:  normal         |         Milestone:  1.5.0
Component:  Backend-Honey  |           Version:
 Severity:  normal         |        Resolution:
 Keywords:                 |        Blocked By:
 Blocking:                 |  Operating System:  All
---------------------------+--------------------------
Changes (by olly):

 * component:  Other => Backend-Honey
 * milestone:   => 1.5.0


Comment:

 > Some of my plans for the next backend would probably help here too (I'm
 > thinking that mass updates would get applied in an LSMT-like manner - that
 > doesn't give an identical DB for a given doc set, but it should reduce
 > the variance significantly).

 The next backend ("honey") is progressing, and it looks like it will help
 a lot with concerns about lingering traces of removed documents - it gets
 us very close to database reproducibility (if the index for the table gets
 fully regenerated, we should be entirely reproducible; I'm not entirely
 sure about the details of incrementally updating that index).

 Instead of a B-tree, it uses (though this isn't all finished yet) a sorted
 string table (SSTable) structure, with multiple levels which are merged
 together by a rolling merge.  That essentially means that, for no extra
 work, traces of removed documents have a limited lifetime - they survive
 only until the rolling merge process next gets to them, because an SSTable
 is just all the (key,value) pairs in the table serialised contiguously in
 ascending key order.  And the rolling merge would probably just build a
 fresh index for its merged output as it goes.
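
 As a rough illustration of why a merge leaves no trace of a removed entry,
 here is a minimal Python sketch.  It is not honey's code or on-disk format:
 the in-memory lists of (key, value) pairs, the TOMBSTONE marker and the
 merge_sstables() name are all assumptions made up for this example.

```python
# Hypothetical marker meaning "this key was deleted" (not honey's real format).
TOMBSTONE = None


def merge_sstables(newer, older, drop_tombstones):
    """Merge two tables, each a list of (key, value) pairs in ascending key order.

    Where both tables hold the same key, the entry from `newer` wins.
    Dropping tombstones is only safe when `older` is the oldest level,
    since otherwise a still older level could hold the deleted key.
    """
    merged = []
    i = j = 0
    while i < len(newer) or j < len(older):
        take_newer = j >= len(older) or (
            i < len(newer) and newer[i][0] <= older[j][0]
        )
        if take_newer:
            key, value = newer[i]
            # Skip the superseded entry in the older table, if there is one.
            if j < len(older) and older[j][0] == key:
                j += 1
            i += 1
        else:
            key, value = older[j]
            j += 1
        if drop_tombstones and value is TOMBSTONE:
            continue
        merged.append((key, value))
    return merged


older = [(b"doc1", b"alpha"), (b"doc2", b"secret"), (b"doc3", b"gamma")]
newer = [(b"doc2", TOMBSTONE), (b"doc4", b"delta")]
result = merge_sstables(newer, older, drop_tombstones=True)
```

 The output is written afresh in key order, so the old value for doc2 is
 simply never copied into it; nothing has to be overwritten in place.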

 In this design, xapian-compact would tell the rolling merge to continue
 until there's a single SSTable, which should then be clean of removed
 information, so for notmuch that would provide a way to clean up after
 handling some sensitive emails.  I'd guess this would be significantly
 cheaper than compacting a glass database too.

 So I think I'm unlikely to try to address this for glass at this point,
 but instead will review this once honey is fully functional.  I'm still
 happy to review patches to improve the situation for glass if anyone wants
 to work on that.

--
Ticket URL: <https://trac.xapian.org/ticket/742#comment:3>
Xapian <https://xapian.org/>