[Xapian-tickets] [Xapian] #742: Xapian should provide a way to securely remove a document from the database
Xapian
nobody at xapian.org
Fri Dec 2 21:13:13 GMT 2016
#742: Xapian should provide a way to securely remove a document from the database
---------------------------+------------------
Reporter: dkg | Owner: olly
Type: defect | Status: new
Priority: normal | Milestone:
Component: Other | Version:
Severity: normal | Keywords:
Blocked By: | Blocking:
Operating System: All |
---------------------------+------------------
Currently, if I remove a document from a Xapian index, the indexed terms
remain in the db, but are marked as part of the freelist.
This means that removal of a document is "insecure" in the sense that if
someone gained access to the index after message deletion, they could
recover information about the document by inspecting the contents of the
freelist.
There may be other traces of a document that are retained in the index as
well: for example, on IRC, olly mentioned:
> oh, there's one awkward thing in the backend stuff -- dividing keys get
> created in the branch levels based on the leaf level keys around where the
> block is split
Some of these fixes may be easier to do than others.
For example, it might be pretty easy to zero blocks when they're returned
to the freelist, but it might be harder to deal with the dividing keys.
It's still worth fixing the easy parts, even if some harder challenges
remain.
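To illustrate the "zero blocks when they're returned to the freelist" idea, here is a minimal sketch using a toy in-memory block store. It is not Xapian's backend code: ToyBlockStore, allocate(), release() and the zero_on_free flag are all hypothetical names invented for this example, and a real fix would also have to make sure the zeroed block actually reaches the disk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy block store; NOT Xapian's actual backend.
class ToyBlockStore {
  public:
    ToyBlockStore(std::size_t block_count, std::size_t block_size)
        : block_size_(block_size), data_(block_count * block_size, 0) {
        // Every block starts out on the freelist (lowest number on top).
        for (std::size_t i = block_count; i > 0; --i) free_.push_back(i - 1);
    }

    // Take a block off the freelist and fill it with a payload byte.
    std::size_t allocate(std::uint8_t fill) {
        assert(!free_.empty());
        std::size_t n = free_.back();
        free_.pop_back();
        std::fill(begin_of(n), begin_of(n) + block_size_, fill);
        return n;
    }

    // Return a block to the freelist. With zero_on_free set, the old
    // contents are overwritten first so they can't be read back later.
    void release(std::size_t n, bool zero_on_free) {
        if (zero_on_free) {
            std::fill(begin_of(n), begin_of(n) + block_size_, std::uint8_t{0});
        }
        free_.push_back(n);
    }

    // True if every byte of block n is zero.
    bool is_zeroed(std::size_t n) const {
        auto first = data_.begin() + n * block_size_;
        return std::all_of(first, first + block_size_,
                           [](std::uint8_t b) { return b == 0; });
    }

  private:
    std::vector<std::uint8_t>::iterator begin_of(std::size_t n) {
        return data_.begin() + n * block_size_;
    }

    std::size_t block_size_;
    std::vector<std::uint8_t> data_;
    std::vector<std::size_t> free_;
};
```

Releasing a block with zero_on_free set to false leaves the old payload readable in the freed block, which is the behaviour described above; with it set to true the block reads back as all zeros. This sketch says nothing about the dividing keys in the branch levels.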
Another way to think about the problem is one of "index reproducibility"
-- if an index contains exactly the same set of documents as another
index, a byte-for-byte identical data store on disk is the ideal. Any
divergence from that ideal leaks some information about documents that
have been added to the database in the past, and then subsequently removed.
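One way to test the reproducibility property would be a byte-for-byte comparison of the on-disk files of two indexes built from the same set of documents. The sketch below compares a single pair of files; files_identical() is a hypothetical helper, not part of Xapian, and a real check would need to apply it to every file in the two database directories.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: true only if both files can be opened and
// contain exactly the same bytes in the same order.
bool files_identical(const std::string& path_a, const std::string& path_b) {
    std::ifstream fa(path_a, std::ios::binary);
    std::ifstream fb(path_b, std::ios::binary);
    if (!fa || !fb) return false;

    std::istreambuf_iterator<char> ia(fa), ib(fb), end;
    while (ia != end && ib != end) {
        if (*ia != *ib) return false;  // first differing byte
        ++ia;
        ++ib;
    }
    // Identical only if both files ended at the same point.
    return ia == end && ib == end;
}
```

Under the ideal described above, an index that once held a since-deleted document and an index that never held it would compare as identical.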
It's possible that any of these fixes would incur a cost that some people are
reluctant to pay (e.g. they're not concerned about the confidentiality of
any of their indexed documents, or they're confident in the long-term
confidentiality of the index itself for other reasons). So it seems
likely that the feature needs to be optional. Whether it should be
opt-in or opt-out, and whether the choice is made on a per-deletion,
per-database, or per-Xapian-session basis, I don't know.
I'm happy to review API proposals if that'd be useful.
--
Ticket URL: <https://trac.xapian.org/ticket/742>
Xapian <https://xapian.org/>
Xapian