[Xapian-discuss] Encrypted Database Files

Olly Betts olly at survex.com
Fri Feb 3 04:40:05 GMT 2006


On Wed, Jan 18, 2006 at 11:40:54AM +0000, James Aylett wrote:
> On Tue, Jan 17, 2006 at 09:09:02PM -0500, David Blewett wrote:
> > I'm considering using Xapian to index email messages in an IMAP server 
> > I'm writing. Is it possible to encrypt the databases stored on disk, so 
> > that someone cannot recover their contents?
> 
> You could encrypt the volume the database is stored on, and that's
> probably the best option IMHO.

I'd tend to agree.  It neatly solves the same problem for the stored
mail messages too.

An alternative would be to patch the B-tree manager's functions which
read and write a block to decrypt and encrypt the contents.  That would
be a very simple patch if you have suitable crypto code already.  See
the functions read_block and write_block in backends/quartz/btree.cc for
Quartz databases.
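
As a rough sketch of the shape such a patch might take (this is not
Quartz's actual code): the struct EncryptedBlockStore, its method
signatures, and the in-memory "disk" are all made up for illustration,
and toy_xor_block is only a placeholder for whatever real crypto code
you already have.  It is NOT secure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder cipher (hypothetical, NOT secure): XOR the block with a
// keystream seeded from the key and the block number.  XOR is its own
// inverse, so the same call both encrypts and decrypts.
static void toy_xor_block(std::vector<unsigned char>& buf,
                          std::uint64_t key, std::uint32_t block_no) {
    std::uint64_t state = key ^ (0x9E3779B97F4A7C15ULL * (block_no + 1ULL));
    if (state == 0) state = 1;  // xorshift must not start at zero
    for (auto& b : buf) {
        state ^= state << 13;   // xorshift64 step
        state ^= state >> 7;
        state ^= state << 17;
        b ^= static_cast<unsigned char>(state & 0xFF);
    }
}

// Hypothetical stand-in for the B-tree manager's block I/O layer.
struct EncryptedBlockStore {
    std::size_t block_size;
    std::uint64_t key;
    std::vector<std::vector<unsigned char>> disk;  // stands in for the file

    // Encrypt just before the block would be written out.
    void write_block(std::uint32_t n, const std::vector<unsigned char>& plain) {
        assert(plain.size() == block_size);
        std::vector<unsigned char> enc = plain;
        toy_xor_block(enc, key, n);
        if (disk.size() <= n) disk.resize(n + 1);
        disk[n] = enc;
    }

    // Decrypt just after the block has been read back.
    std::vector<unsigned char> read_block(std::uint32_t n) const {
        assert(n < disk.size());
        std::vector<unsigned char> out = disk[n];
        toy_xor_block(out, key, n);
        return out;
    }
};
```

The point is only that everything above the block layer keeps seeing
plaintext blocks, so nothing else in the backend should need to change.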

> I wouldn't recommend it because the index terms would still be
> unencrypted, so while it isn't possible to get the actual email
> contents, you could get all the posting lists and hence the (stemmed
> version of the) words in the email.

And if you're storing positional information it's possible to
reconstruct something close to the original text (with no punctuation
and with terms stemmed if you're using stemming).

You could encrypt the terms themselves, but that's problematic.  For
example, you'd still leak frequency information about the terms, which
can be used to help break the encryption.  Also, I think short
plaintexts (and terms are short) can open you up to brute-force attacks
with some encryption algorithms.
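
To illustrate the frequency leak, here's a small sketch.  It assumes
the terms are mapped deterministically, so the same term always gives
the same opaque value.  The function opaque_token is hypothetical (a
keyed FNV-1a hash standing in for a real cipher), as is the helper
sorted_counts.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical deterministic mapping of a term to an opaque value.
// A keyed FNV-1a hash is used purely as a stand-in for a cipher.
std::uint64_t opaque_token(const std::string& term, std::uint64_t key) {
    std::uint64_t h = 14695981039346656037ULL ^ key;
    for (unsigned char c : term) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Occurrence counts of the distinct items, largest first.
template <typename T>
std::vector<std::size_t> sorted_counts(const std::vector<T>& items) {
    std::map<T, std::size_t> counts;
    for (const auto& item : items) ++counts[item];
    std::vector<std::size_t> out;
    for (const auto& kv : counts) out.push_back(kv.second);
    std::sort(out.begin(), out.end(), std::greater<std::size_t>());
    return out;
}
```

Running sorted_counts over a list of terms and over the corresponding
opaque_token values gives the same counts, so an attacker who knows
roughly how often common words occur can start guessing which opaque
value is which.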

Cheers,
    Olly


