notmuch: Xapian exception during database creation
Olly Betts
olly at survex.com
Fri Dec 29 22:23:01 GMT 2017
On Fri, Dec 29, 2017 at 03:00:47PM +0000, David Edmondson wrote:
> Running notmuch from git on Debian testing[1] with the mail and database
> sitting on a ZFS filesystem, adding mail to a new database:
>
> > agrajag-testing ~/s/notmuch % ./notmuch new
> > Found 605510 total files (that's not much mail).
> > add_file: A Xapian exception occurred36m 37s remaining).
> > A Xapian exception occurred adding message: Unexpected end of posting list for 'G0000000000014364'.
Each posting list chunk has a "last chunk" flag - this exception means
this flag appears not to be set for the final chunk for the term
'G0000000000014364'.
The update logic for this flag is somewhat complex, but it looks OK to
me. This code hasn't changed recently either, so I suspect the problem
is actually elsewhere.
> > Processed 137296 total files in 1h 21m 17s (28 files/sec.).
> > Added 135950 new messages to the database.
> > Note: A fatal error was encountered: A Xapian exception occurred
> > agrajag-testing ~/s/notmuch %
>
> This happens consistently at this point in the process - if I remove the
> database and start again it will fail at the same place.
>
> After the failure the database appears to be okay:
>
> > agrajag-testing ~/s/notmuch % xapian-check ~/Maildir/.notmuch/xapian
> > docdata:
> > blocksize=8K items=0 firstunused=1 revision=1 levels=0 root=(faked)
> > void B-tree checked okay
> > docdata table structure checked OK
> >
> > termlist:
> > blocksize=8K items=0 firstunused=2 revision=1 levels=0 root=(faked)
> > void B-tree checked okay
> > termlist table structure checked OK
> >
> > postlist:
> > blocksize=8K items=2 firstunused=1 revision=1 levels=0 root=0
> > B-tree checked okay
> > postlist table structure checked OK
> >
> > position:
> > blocksize=8K items=0 firstunused=1 revision=1 levels=0 root=(faked)
> > void B-tree checked okay
> > position table structure checked OK
This seems to be for an almost empty database (2 items in the postlist
table and nothing anywhere else) which doesn't really seem consistent
with the amount of data notmuch reports as having processed. Are you
setting XAPIAN_FLUSH_THRESHOLD very high?
You can look at the low level entries in the postlist table with:
xapian-inspect ~/Maildir/.notmuch/xapian/postlist.glass
(You'll need to build xapian-core from source to get xapian-inspect,
as it's really a tool for developers).
I'd guess the two entries are user metadata entries notmuch has added
(keys starting \x00\xc0) - otherwise I'd expect entries in other tables
too.
> Unfortunately I can't share the dataset, as it contains mail belonging
> to my employer :-(
>
> Any suggestions on how to go about debugging this?
Building xapian-core configured with --enable-assertions is worth
trying, as it might catch the problem earlier.
You could also try running it under valgrind, though that may take some
time if it normally takes 1h21m.
If that doesn't help, either we need to find a way to reproduce this
which isn't using sensitive data, or add more consistency checks to the
code to try to pin down what's going on.
Cheers,
Olly
More information about the Xapian-discuss
mailing list