Strange index consistency issue

Bob Cargill rsc.bioeng at gmail.com
Thu Jan 14 23:57:03 GMT 2016


Olly Betts <olly <at> survex.com> writes:

> 
> On Thu, Jan 14, 2016 at 11:04:29AM +0100, Jean-Francois Dockes wrote:
> > Olly Betts writes:
> >  > On Sun, Jan 10, 2016 at 02:53:14AM +0000, Bob Cargill wrote:
> >  > > I will look into the bug you listed to see if it might be related.
> >  > > If there is anything else that I can do, please let me know.
> >  > 
> >  > If that bug is not the cause, it would be good to get to the bottom
> >  > of this - the database shouldn't become corrupt like this.
> > 
> > I remembered something: I could only reproduce issue #645 with separate
> > read/write database objects, but this one is with recoll 1.21, which uses a
> > single object, so maybe a different problem. 
> 
> The underlying bug for #645 was that cursors weren't getting rebuilt in
> some situations where they needed to be, and could end up with bad data
> in, and that bad data could be stale data.  So it's plausible a write
> might go to the wrong block, which could explain "lost" data like we
> have here.
> 
> It could easily be a different problem, but testing with the latest
> 1.2.x would be useful to make sure we aren't trying to track down a bug
> we've already fixed.
> 

I don't see the most recent Xapian in the Ubuntu 14.04 repositories. I have
1.2.16 from the Ubuntu Trusty repositories. The PPA doesn't list a Trusty
version. Is 1.2.22 easy to build? I have not built anything on Linux (sorry
for the naive question; I've built under Windows in the distant past, so
I'm familiar with compilers, source files, and make).

> 
> It may not matter for recoll, but more generally we don't want Xapian
> databases getting corrupt.  And we do aim to survive power failures,
> kernel panics, etc - achieving that in all cases is rather hard, but I
> don't think that's a reason to drop it as an aim.
> 
> Examples of corruption that can be reproduced (even if it's not entirely
> on demand) are very useful - if you can see the corruption happen it's
> a lot easier to work out what is going wrong than if you just see the
> aftermath.

I have the recoll log file from (almost) the beginning. I'm not sure if
I can pick out the source of the errors, but I will take a look at what
recoll was doing when the errors began.

> 
> > There is one weird thing though, which is why, in this situation,
> > replace_document() appears to repeatedly accept data which goes into a
> > black hole.
> 
> Are you replacing the document with the same data?

Yes. All of the files that I mentioned earlier (~350) have not been changed
in years.

> 
> If so, I think what happens is that it looks in the termlist table to
> see if the document exists.  It does, so it compares the terms and sees
> they are the same, and decides there's nothing to do.
> 
> It never looks at the document length list, so doesn't see that it is
> damaged.
> 

This is likely what happened. Recoll should have provided the same terms for
the file and the same length. 
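
As I understand the explanation above, the behaviour can be modelled
roughly like this. This is only a toy sketch in Python, not Xapian's
actual code: ToyIndex and its two dicts are names I made up to stand in
for the termlist table and the document length list, and the "length" is
simply the number of terms.

```python
class ToyIndex:
    """Toy stand-in for an index with a termlist table and a doc length list."""

    def __init__(self):
        self.termlists = {}   # docid -> frozenset of terms (hypothetical layout)
        self.doclengths = {}  # docid -> document length (hypothetical layout)

    def replace_document(self, docid, terms):
        """Return True if anything was written, False if the call was a no-op."""
        terms = frozenset(terms)
        # Short-circuit: the stored terms match the new ones, so nothing is
        # written. The doc length list is never consulted on this path.
        if self.termlists.get(docid) == terms:
            return False
        self.termlists[docid] = terms
        self.doclengths[docid] = len(terms)
        return True


index = ToyIndex()
index.replace_document(1, ["alpha", "beta"])

# Simulate the damage: the length entry is lost, the termlist survives.
del index.doclengths[1]

# Re-indexing the unchanged file is accepted but writes nothing.
wrote = index.replace_document(1, ["alpha", "beta"])
print(wrote, 1 in index.doclengths)
```

In this model the damaged entry stays damaged for as long as the file is
unchanged, and only gets rewritten once the terms differ, which would
match what I see with files that have not been touched in years.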

More information about the Xapian-discuss mailing list