errors on rebuild

Olly Betts olly at survex.com
Tue Feb 28 23:40:07 GMT 2017


On Mon, Feb 27, 2017 at 10:29:46AM -0800, Ryan Cross wrote:
> I am trying to rebuild an index of 2+ million documents and have not been successful.  I am running 
> 
> Python 2.7
> Django 1.7
> Haystack 2.1.1
> Xapian 1.2.21
> 
> The index rebuild command I’m using is: django-admin.py rebuild_index --noinput --batch-size=100000
> The rebuild completes but an immediate xapian-check returns this error:
[...]
> Trying the latest stable version, Xapian 1.4.3, it fails during the rebuild:
> 
> All documents removed.
> Indexing 2233651 messages
> Traceback (most recent call last):
>
>   File "/a/mailarch/current/haystack/management/commands/update_index.py", line 221, in handle_label
>     self.update_backend(label, using)
>   File "/a/mailarch/current/haystack/management/commands/update_index.py", line 266, in update_backend
>     do_update(backend, index, qs, start, end, total, self.verbosity)
>   File "/a/mailarch/current/haystack/management/commands/update_index.py", line 89, in do_update
>     backend.update(index, current_qs)
>   File "/a/mailarch/current/haystack/backends/xapian_backend.py", line 286, in update
>     database.close()

What's the version of xapian-haystack?  There's not a database.close() anywhere
near line 286 in git master:

https://github.com/notanumber/xapian-haystack/blob/master/xapian_backend.py#L286
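One quick way to see which copy of the backend is actually being loaded is to ask Python. This is only a sketch: describe_module is a hypothetical helper name, the module path is taken from the traceback above, and not every module defines a __version__ attribute (you'll get None if it doesn't). Importing the backend may also need your Django settings to be configured first.

```python
import importlib


def describe_module(name):
    """Import a module by dotted name and report where it was loaded from.

    Returns a dict with the file path and, if the module defines one,
    its __version__ attribute (None otherwise).
    """
    mod = importlib.import_module(name)
    return {
        "name": name,
        "file": getattr(mod, "__file__", None),
        "version": getattr(mod, "__version__", None),
    }


# Example (module path as seen in the traceback; run inside the project's
# environment):
# print(describe_module("haystack.backends.xapian_backend"))
```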

> xapian.DatabaseCorruptError: Expected block 615203 to be level 0, not 1
> docdata:
> blocksize=8K items=380000 firstunused=21983 revision=38 levels=2 root=21410

Is that the full output of xapian-check?

> Any suggestions for how I could get more information to troubleshoot this
> failure would be greatly appreciated.

Is the data to reproduce this something you can make available?

I'd stick with Xapian 1.4.3 for trying to narrow this down (if it's a Xapian
bug we can backport the fix once identified).

The error message means that a block which was expected to be at the leaf level
was actually marked as being one level above, which suggests either there's an
obscure bug in the backend code which only manifests in rare circumstances, or
something is corrupting data (could be in memory or on disk).

Since this happens with both 1.2.x and 1.4.x, I would tend to suspect something
external rather than a bug in Xapian, as the default backends in 1.2 and 1.4
have some significant differences.  It's certainly possible that it's a Xapian
bug, but if so I would expect us to be seeing other reports.  That said, we may
actually have had one or two and attributed them to #675, which was fixed in
1.2.21 (though nobody has yet said "no, still seeing that"):

https://trac.xapian.org/ticket/675

You could look at block 615203 of docdata.glass to see what it looks like -
that might offer clues:

xxd -g1 -seek $((615203*8192)) -len 8192 docdata.glass
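
If xxd isn't to hand, roughly the same dump can be produced with a few lines of Python. This is a minimal sketch rather than anything Xapian provides: read_block and hexdump are hypothetical helper names, and the 8192-byte block size is taken from the "blocksize=8K" reported above.

```python
def read_block(path, block_number, block_size=8192):
    """Return the raw bytes of one fixed-size block from a table file.

    The byte offset is block_number * block_size, matching the
    $((615203*8192)) arithmetic in the xxd command.
    """
    with open(path, "rb") as f:
        f.seek(block_number * block_size)
        return f.read(block_size)


def hexdump(data, base_offset=0, width=16):
    """Format bytes as lines of '<offset>: <hex bytes>', one byte per group."""
    lines = []
    for i in range(0, len(data), width):
        chunk = bytearray(data[i:i + width])
        hex_bytes = " ".join("%02x" % b for b in chunk)
        lines.append("%08x: %s" % (base_offset + i, hex_bytes))
    return "\n".join(lines)


# Example (file name and block number as in the error message):
# block = read_block("docdata.glass", 615203)
# print(hexdump(block, base_offset=615203 * 8192))
```

If read_block returns fewer than 8192 bytes, the file ends before or inside that block, which would itself be worth reporting.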

It'd also be good to eliminate possible system issues - e.g. check the disk is
healthy (check the SMART status, run fsck on it), run a RAM test (distros often
provide a way to run memtest86+ or similar from the boot menu).

Cheers,
    Olly


