[Xapian-tickets] [Xapian] #317: Database corruption after disk-full error
Xapian
nobody at xapian.org
Thu Dec 18 09:07:38 GMT 2008
#317: Database corruption after disk-full error
---------------------------+------------------------------------------------
Reporter: richard | Owner: richard
Type: defect | Status: new
Priority: normal | Milestone: 1.0.10
Component: Backend-Flint | Version: 1.0.7
Severity: normal | Resolution:
Keywords: | Blockedby:
Platform: All | Blocking:
---------------------------+------------------------------------------------
Comment(by richard):
Here comes another long comment...
In the second run (i.e., the one which ends in a segmentation fault), the
first exception raised is (according to a run under gdb with "catch throw"
set):
{{{
#0 0xb7a13e05 in __cxa_throw () from /usr/lib/libstdc++.so.6
#1 0xb7b0bf02 in flint_io_write (fd=12, p=0x81cbd5c "\b\005\200@\003\002�\004�7\223\"", n=563)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_io.cc:57
#2 0xb7af5c51 in FlintTable_base::write_to_file (this=0x81de684, filename=@0xbfd82b30, base_letter=65 'A', tablename=@0xbfd82b28, changes_fd=-1, changes_tail=0x0)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_btreebase.cc:333
#3 0xb7b23a89 in FlintTable::commit (this=0x81de658, revision=8, changes_fd=-1, changes_tail=0x0)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_table.cc:1790
#4 0xb7b00ce2 in FlintDatabase::set_revision_number (this=0x81de630, new_revision=8)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:500
#5 0xb7b02000 in FlintDatabase::apply (this=0x81de630)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:786
#6 0xb7b03956 in FlintWritableDatabase::flush (this=0x81de630)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:1305
#7 0xb7a809a6 in Xapian::WritableDatabase::flush (this=0x81f9f50)
    at /home/richard/private/Working/xapian/working/xapian-core/api/omdatabase.cc:687
#8 0xb7c39bd7 in _wrap_WritableDatabase_flush (args=0xb7da814c)
    at modern/xapian_wrap.cc:26857
#9 0x0805cb97 in PyObject_Call ()
#10 0x080c7aa7 in PyEval_EvalFrameEx ()
#11 0x080cb1f7 in PyEval_EvalCodeEx ()
#12 0x080cb347 in PyEval_EvalCode ()
#13 0x080ea818 in PyRun_FileExFlags ()
#14 0x080eaab9 in PyRun_SimpleFileExFlags ()
#15 0x08059335 in Py_Main ()
#16 0x080587f2 in main ()
}}}
The second exception is:
{{{
#0 0xb7a13e05 in __cxa_throw () from /usr/lib/libstdc++.so.6
#1 0xb7b2348b in FlintTable::cancel (this=0x81de658)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_table.cc:1876
#2 0xb7afa5e1 in FlintDatabase::cancel (this=0x81de630)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:800
#3 0xb7b064fe in FlintWritableDatabase::cancel (this=0x81de630)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:1732
#4 0xb7b01b8d in FlintDatabase::modifications_failed (this=0x81de630, old_revision=7, new_revision=8, msg=@0xbfd82c60)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:747
#5 0xb7b020f0 in FlintDatabase::apply (this=0x81de630)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:788
#6 0xb7b03956 in FlintWritableDatabase::flush (this=0x81de630)
    at /home/richard/private/Working/xapian/working/xapian-core/backends/flint/flint_database.cc:1305
#7 0xb7a809a6 in Xapian::WritableDatabase::flush (this=0x81f9f50)
    at /home/richard/private/Working/xapian/working/xapian-core/api/omdatabase.cc:687
#8 0xb7c39bd7 in _wrap_WritableDatabase_flush (args=0xb7da814c)
    at modern/xapian_wrap.cc:26857
#9 0x0805cb97 in PyObject_Call ()
#10 0x080c7aa7 in PyEval_EvalFrameEx ()
#11 0x080cb1f7 in PyEval_EvalCodeEx ()
#12 0x080cb347 in PyEval_EvalCode ()
#13 0x080ea818 in PyRun_FileExFlags ()
#14 0x080eaab9 in PyRun_SimpleFileExFlags ()
#15 0x08059335 in Py_Main ()
#16 0x080587f2 in main ()
}}}
This means that the second error is due to FlintTable::cancel() being
unable to read the alternate base file, which doesn't exist.
I think the problem is that FlintTable::commit() sets the "base_letter"
member to point to the alternate base before failing (and also sets
various properties of the FlintTable_base object), and doesn't tidy itself
up on exception. Therefore, after an exception, the FlintTable object is
left in an inconsistent state, such that cancel() fails.
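The suspected failure mode can be shown with a toy model. This is only a sketch: ToyTable, bases_on_disk and disk_full are made-up names, and the real FlintTable does far more than this. It only illustrates how switching base_letter before the write has succeeded leaves cancel() looking for a base file that was never written.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>

// Hypothetical model of a table with two alternating base files, 'A' and 'B'.
// None of these names come from the Flint sources.
struct ToyTable {
    std::set<char> bases_on_disk{'A'};  // base files that actually exist
    char base_letter = 'A';             // base the object believes is current
    bool disk_full = false;             // simulates the failing write

    // Mirrors the suspected bug: the member is switched before the write
    // is known to have succeeded, and is not restored on failure.
    void commit() {
        base_letter = (base_letter == 'A') ? 'B' : 'A';
        if (disk_full) throw std::runtime_error("write failed: disk full");
        bases_on_disk.insert(base_letter);
    }

    // cancel() needs to read the base file named by base_letter.
    void cancel() {
        if (!bases_on_disk.count(base_letter))
            throw std::runtime_error("cannot read base file");
    }
};

// Returns true if cancel() throws after a failed commit().
bool cancel_fails_after_failed_commit() {
    ToyTable t;
    t.disk_full = true;
    try { t.commit(); } catch (const std::runtime_error &) {}
    try { t.cancel(); } catch (const std::runtime_error &) { return true; }
    return false;
}
```

In this model cancel_fails_after_failed_commit() returns true, which matches the pair of exceptions seen in the backtraces.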
I think a good fix might be to respond to a failure in commit() by closing
and reopening the tables, to ensure that they're in a consistent state.
This could probably be done most neatly by a try-catch-throw around most
of FlintTable::commit(), which calls FlintTable::close() on any error.
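A rough sketch of that try-catch-throw shape, under stated assumptions: SketchTable, on_disk_letter and disk_full are hypothetical stand-ins, and open() here just reloads the in-memory state from what the model says is on disk. It is not the real FlintTable code.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a table object; not the real FlintTable.
class SketchTable {
  public:
    bool disk_full = false;  // simulates the failing write

    void open() {
        base_letter = on_disk_letter;  // reload state from what is on disk
        handle = 3;                    // arbitrary fake descriptor
    }
    void close() { handle = -1; }
    bool is_open() const { return handle >= 0; }
    char current_base() const { return base_letter; }

    void commit() {
        try {
            // State is modified before the write, as in the suspected bug.
            base_letter = (base_letter == 'A') ? 'B' : 'A';
            if (disk_full) throw std::runtime_error("write failed: disk full");
            on_disk_letter = base_letter;
        } catch (...) {
            close();  // discard the possibly inconsistent in-memory state
            throw;    // rethrow so the caller still sees the original error
        }
    }

  private:
    int handle = -1;
    char base_letter = 'A';
    char on_disk_letter = 'A';
};

// Returns true if a failed commit leaves the table closed and a later
// open() brings it back to the last base that was really written.
bool failed_commit_recovers() {
    SketchTable t;
    t.open();
    t.disk_full = true;
    bool threw = false;
    try { t.commit(); } catch (const std::runtime_error &) { threw = true; }
    bool closed = !t.is_open();
    t.open();
    return threw && closed && t.current_base() == 'A';
}
```

The point of closing rather than trying to undo each member by hand is that reopening rebuilds everything from disk, so there is no list of fields to keep in sync with commit().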
commit() is only called by FlintDatabase::set_revision_number(), which in
turn is called from several places:
- FlintDatabase constructor. If a failure occurs here, the exception is
just propagated, and construction fails, so we don't need to do any
cleanup.
- FlintDatabase::apply(). If a failure occurs here,
modifications_failed() is called, which calls cancel() and then
open_tables() (so tables closed by failure of commit() would be reopened
here).
- FlintDatabase::modifications_failed(). If a failure occurs here, the
best response is probably to put the database into a "hard-closed" state.
Olly suggested adding an alternative to FlintTable::close() which sets the
handle to -2, and treating that as a closed state from which the tables
are not automatically reopened (an exception is raised instead). I think
this would be a good response, in the circumstances.
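A minimal sketch of the "hard-closed" idea. Only the value -2 comes from the suggestion above; the class name, the use of -1 for an ordinary close, the permanent flag and the exception type are all assumptions made for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Assumed convention: -1 is an ordinary close that may be reopened on demand.
const int kClosed = -1;
// Value from the suggestion above: closed for good, never reopened.
const int kHardClosed = -2;

// Hypothetical handle wrapper; not part of Xapian.
class SketchHandle {
  public:
    // permanent == true models the proposed close() alternative.
    void close(bool permanent = false) {
        handle = permanent ? kHardClosed : kClosed;
    }

    // Called before any access: reopens after an ordinary close, but
    // refuses to reopen once the handle has been hard-closed.
    int ensure_open() {
        if (handle == kHardClosed)
            throw std::logic_error("database has been closed");
        if (handle == kClosed)
            handle = 3;  // arbitrary fake descriptor standing in for open()
        return handle;
    }

  private:
    int handle = kClosed;
};

// Returns true if an ordinary close can be reopened but a hard close cannot.
bool hard_close_blocks_reopen() {
    SketchHandle h;
    h.ensure_open();
    h.close();
    h.ensure_open();  // ordinary close: reopened silently
    h.close(true);    // hard close after an unrecoverable failure
    try { h.ensure_open(); } catch (const std::logic_error &) { return true; }
    return false;
}
```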
--
Ticket URL: <http://trac.xapian.org/ticket/317#comment:4>
Xapian <http://xapian.org/>