errors on rebuild
Ryan Cross
rcross at amsl.com
Sat Mar 25 23:36:25 GMT 2017
Hi Olly,
After upgrades my stack is now:
Python 2.7
Django 1.8
Haystack 2.6.0
Xapian 1.4.3 (latest xapian-haystack backend with some modifications)
Using the same rebuild command as below but with --batch-size=50000.
The issue has now become one of performance. I am indexing 2.2 million documents. Using delve I can see that indexing starts off at about 100,000 records an hour. This is consistent with the roughly 24 hour rebuild time I was experiencing with Xapian 1.2.21 (chert). However, after 75 hours of build time, the index is about 75% complete and records are processing at a rate of 10,000/hr. The index is 51GB in size, of which 30GB is position.glass.
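For a rough sense of what that rate means for the rest of the build, here is a back-of-the-envelope sketch in Python. It assumes the rate holds steady at 10,000/hr from here on, which is probably optimistic given that it has been falling; the function name is made up for illustration.

```python
def remaining_hours(total_docs, fraction_done, docs_per_hour):
    """Estimate hours left, assuming the indexing rate stays constant."""
    remaining_docs = total_docs * (1.0 - fraction_done)
    return remaining_docs / docs_per_hour


# Figures from this message: 2.2 million documents, about 75% indexed,
# currently processing about 10,000 records an hour.
print("about %.0f hours left" % remaining_hours(2200000, 0.75, 10000))
```

With those inputs the estimate comes to about 55 more hours on top of the 75 already spent.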
Here is a one minute strace summary:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 63.97    1.272902          13    100240           pread
 33.71    0.670733          14     48175           pwrite
  0.57    0.011253           8      1484           read
  0.45    0.008938           6      1524           fstat
  0.36    0.007098           6      1270           lseek
  0.25    0.004988          20       254           open
  0.18    0.003544          14       254           recvfrom
  0.11    0.002148           8       254           sendto
  0.10    0.002056           8       254           close
  0.10    0.001949           8       254           poll
  0.07    0.001429          11       127           munmap
  0.06    0.001111           9       127           mmap
  0.04    0.000802           6       127       127 ioctl
  0.04    0.000773           6       127           gettimeofday
------ ----------- ----------- --------- --------- ----------------
100.00    1.989724                154471       127 total

This is ten documents, with the number of terms in the 10s to low 100s range. Is there a way I can tune for better performance?
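To put the strace numbers in per-second terms, here is a small Python sketch. The call counts are copied from the summary above, and the 60-second window is taken from the "one minute" description; the helper name is only illustrative, and nothing here measures bytes transferred, only call counts.

```python
# Call counts copied from the one-minute strace summary above.
WINDOW_SECONDS = 60.0
calls = {"pread": 100240, "pwrite": 48175}
total_calls = 154471


def per_second(count, seconds=WINDOW_SECONDS):
    """Average number of calls per second over the sampling window."""
    return count / seconds


for name in sorted(calls):
    print("%s: %.0f calls/sec" % (name, per_second(calls[name])))

# Fraction of all syscalls in the sample that were pread or pwrite.
io_share = sum(calls.values()) / float(total_calls)
print("pread+pwrite share of all syscalls: %.1f%%" % (100 * io_share))
```

This only restates the summary in a different unit; it says nothing about whether the reads are hitting disk or the page cache.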
Thanks,
Ryan
> On Mar 2, 2017, at 4:48 PM, Ryan Cross <rcross at amsl.com> wrote:
>
> Hi Olly,
>
> Thanks for the detailed response. I hadn’t realized there was a new xapian haystack backend. I’m going to try that but I have some upgrades to do first. Django 1.8, etc.
>
> Thanks,
> Ryan
>
>> On Feb 28, 2017, at 3:40 PM, Olly Betts <olly at survex.com> wrote:
>>
>> On Mon, Feb 27, 2017 at 10:29:46AM -0800, Ryan Cross wrote:
>>> I am trying to rebuild an index of 2+ million documents and have not been successful. I am running
>>>
>>> Python 2.7
>>> Django 1.7
>>> Haystack 2.1.1
>>> Xapian 1.2.21
>>>
>>> The index rebuild command I’m using is: django-admin.py rebuild_index --noinput --batch-size=100000
>>> The rebuild completes but an immediate xapian-check returns this error:
>> [...]
>>> Trying the latest stable version, Xapian 1.4.3, it fails during the rebuild:
>>>
>>> All documents removed.
>>> Indexing 2233651 messages
>>> Traceback (most recent call last):
>>> …
>>>
>>> File "/a/mailarch/current/haystack/management/commands/update_index.py", line 221, in handle_label
>>> self.update_backend(label, using)
>>> File "/a/mailarch/current/haystack/management/commands/update_index.py", line 266, in update_backend
>>> do_update(backend, index, qs, start, end, total, self.verbosity)
>>> File "/a/mailarch/current/haystack/management/commands/update_index.py", line 89, in do_update
>>> backend.update(index, current_qs)
>>> File "/a/mailarch/current/haystack/backends/xapian_backend.py", line 286, in update
>>> database.close()
>>
>> What's the version of xapian-haystack? There's not a database.close() anywhere
>> near line 286 in git master:
>>
>> https://github.com/notanumber/xapian-haystack/blob/master/xapian_backend.py#L286
>>
>>> xapian.DatabaseCorruptError: Expected block 615203 to be level 0, not 1
>>> docdata:
>>> blocksize=8K items=380000 firstunused=21983 revision=38 levels=2 root=21410
>>
>> Is that the full output of xapian-check?
>>
>>> Any suggestions for how I could get more information to troubleshoot this
>>> failure would be greatly appreciated.
>>
>> Is the data to reproduce this something you can make available?
>>
>> I'd stick with Xapian 1.4.3 for trying to narrow this down (if it's a Xapian
>> bug we can backport the fix once identified).
>>
>> The error message means that a block which was expected to be at the leaf level
>> was actually marked as being one level above, which suggests either there's an
>> obscure bug in the backend code which only manifests in rare circumstances, or
>> something is corrupting data (could be in memory or on disk).
>>
>> Since this happens with both 1.2.x and 1.4.x I would tend to suspect it's
>> something external (rather than a bug in Xapian) as the default backends in 1.2
>> and 1.4 have some significant differences. It's certainly possible it's a
>> Xapian bug, but if so I would expect we'd be seeing other reports, though maybe
>> we've actually had one or two and thought them due to #675, which was fixed in
>> 1.2.21 (however nobody's yet said "no, still seeing that"):
>>
>> https://trac.xapian.org/ticket/675
>>
>> You could look at block 615203 of docdata.glass to see what it looks like -
>> that might offer clues:
>>
>> xxd -g1 -seek $((615203*8192)) -len 8192 docdata.glass
>>
>> It'd also be good to eliminate possible system issues - e.g. check the disk is
>> healthy (check the SMART status, run fsck on it), run a RAM test (distros often
>> provide a way to run memtest86+ or similar from the boot menu).
>>
>> Cheers,
>> Olly
>