Strange index consistency issue

Thu Jan 14 10:04:29 GMT 2016

Olly Betts writes:
 > On Sun, Jan 10, 2016 at 02:53:14AM +0000, Bob Cargill wrote:
 > > I am the recoll user mentioned in the first post above. I still have a copy
 > > of the (potentially) corrupted index and I did the requested testing. 
 > > 
 > > I ran delve -t '' ./xapiandb on the index and it returned a very long list
 > > of document IDs, separated by spaces. I then ran delve -t '' ./xapiandb |
 > > grep " 6 " and it returned nothing. 
 > > 
 > > So, document 6 was not in the list. 
 > > 
 > > There were other documents missing from the index as well, so I ran delve -t
 > > '' ./xapiandb | head -c 100 
 > > The first ID was 257, then it began sequentially from 356. Looks like the
 > > first approximately 350 document IDs are "missing." 
 > 
 > OK, that matches what I suspected was happening.
 > 
 > I've extended xapian-check so it should catch this case - you can get
 > the patch here ("Unified Diff" link at the bottom):
 > 
 > http://trac.xapian.org/changeset/ee3bc009d98a7cea8a2944135f38626e73bbcae3/git

Thanks, Olly!

Bob: I'll ping you from the Recoll issue about running the new xapian-check
(I have built it).

 > > I will look into the bug you listed to see if it might be related. If there
 > > is anything else that I can do, please let me know. 
 > 
 > If that bug is not the cause, it would be good to get to the bottom of this -
 > the database shouldn't become corrupt like this.

I remembered something: I could only reproduce issue #645 with separate
read/write database objects, but this one is with recoll 1.21, which uses a
single object, so this may be a different problem.

While a Xapian bug might be involved, there are many reasons why a Recoll
indexer can meet an abrupt end in the general case (not saying this is
the case here).
A pulled power cord would be the most radical example. Recoll usually does
not run in a datacenter...

In most cases, the data is replaceable without too much effort, so that
reliable detection of an issue is almost as good as assurance that it won't
occur. The latter seems very difficult to attain when running in an
uncontrolled environment.
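
As an illustration of what such detection could look like, here is a
minimal Python sketch of the check Bob did by hand with delve: take the
document ID list and report the ranges that are absent. The function names
are hypothetical, and the sketch assumes the IDs are whitespace-separated
integers, with any header line ending in a colon. A gap is not proof of
corruption by itself, since deleted documents also leave holes in the ID
sequence.

```python
def parse_docids(delve_output):
    """Return the sorted document IDs found in a posting list dump."""
    # Assumption: if there is a header, it ends with ':' and the IDs follow.
    _, _, body = delve_output.rpartition(":")
    return sorted(int(tok) for tok in body.split() if tok.isdigit())


def missing_ranges(docids, first=1):
    """Return inclusive (start, end) ranges of IDs absent from docids."""
    gaps = []
    expected = first
    for docid in docids:
        if docid > expected:
            gaps.append((expected, docid - 1))
        expected = max(expected, docid + 1)
    return gaps


if __name__ == "__main__":
    # Sample input shaped like the start of the list Bob reported.
    sample = "257 356 357 358"
    for start, end in missing_ranges(parse_docids(sample)):
        print("missing document IDs %d-%d" % (start, end))
```

The real list could be fed in by saving the output of
delve -t '' ./xapiandb to a file and passing its contents to parse_docids().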

There is one weird thing though: why, in this situation, does
replace_document() appear to repeatedly accept data which then goes into a
black hole?

Cheers,

jf