[Xapian-devel] buffered tables, sessions, and transactions
Richard Boulton
richard at lemurconsulting.com
Wed May 19 17:31:04 BST 2004
Olly Betts wrote:
> I've noticed a slight wrinkle - currently if a database is destroyed
> with an active transaction, cancel_transaction is called. Without
> this method, if an error is thrown part way through a transaction,
> either we apply the whole lot, or lose everything since the last call to
> flush() (explicit or implicit). That could be several transactions, so
> at least this preserves atomicity better than just flushing.
I think the possible situations are:
1) Users don't require any guarantees other than that the database
remains in a consistent state, since they can easily replay all their
data if an error occurs (and use a unique ID to check if each item
was already indexed).
2) Users want to be able to call something ("flush()") to ensure that
changes up to a certain point are not lost from the database if an
error occurs in future.
3) Users want to be able to ensure that a group of modifications (eg, an
insert and delete pair, or something more complex) are atomically
added to the database. If an error occurs part way through the
group, the entire group must be discarded (possibly along with other
modifications, if a flush() wasn't called before entering the group).
Once the group is complete, it may still get discarded if an error
occurs before flush() has been called.
4) As (3), but users want to be able to change their mind part-way
through a group of modifications (perhaps due to an error outside
Xapian) and cancel the whole group.
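As a rough illustration of situation 1, replaying everything after an error is safe as long as each item carries a unique ID that can be checked first. The sketch below is only that: ToyIndex, add_if_missing() and replay() are invented names for illustration and are not part of Xapian.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an index; this is not Xapian's API.
struct ToyIndex {
    std::set<std::string> ids;  // unique IDs of items already indexed

    // Returns true if the item was added, false if it was already present.
    bool add_if_missing(const std::string& id) {
        return ids.insert(id).second;
    }
};

// Replay every item; items that survived the earlier error are skipped.
// Returns the number of items actually (re)indexed.
std::size_t replay(ToyIndex& index, const std::vector<std::string>& items) {
    std::size_t added = 0;
    for (const std::string& id : items) {
        if (index.add_if_missing(id)) ++added;
    }
    return added;
}

// Usage: "doc1" survived the earlier error, so only two items are re-added.
inline void replay_demo() {
    ToyIndex index;
    index.add_if_missing("doc1");
    assert(replay(index, {"doc1", "doc2", "doc3"}) == 2);
}
```

Because the replay is idempotent, such users need nothing from the database beyond consistency.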
It seems to me that the use of the word "transaction" has various
connotations which we don't necessarily wish to implement. In
particular, to say that a transaction is complete sounds to me as if the
transaction should have been written to disk.
How about removing the transaction methods and implementing:
begin_group() - begins a group of modifications (no flush before
start).
end_group() - ends a group of modifications, but doesn't flush.
cancel_group() - cancel a group of modifications (and anything else
which hasn't been flushed by flush() or autoflush()).
Autoflush will never happen during a group, and an explicit flush()
called during a group will report an error.
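To make the proposed semantics concrete, here is a minimal sketch using a toy in-memory class rather than real Xapian code. Only begin_group(), end_group(), cancel_group() and flush() come from the proposal above; ToyDatabase, add(), the autoflush threshold, and the choice of std::logic_error for reporting errors are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy in-memory model of the proposed group semantics. Hypothetical: this is
// not Xapian's database class, and the autoflush threshold is an invented knob.
class ToyDatabase {
public:
    explicit ToyDatabase(std::size_t autoflush_threshold)
        : threshold_(autoflush_threshold) {}

    void begin_group() {
        if (in_group_) throw std::logic_error("group already open");
        in_group_ = true;   // no flush before the group starts
    }

    void end_group() {
        if (!in_group_) throw std::logic_error("no group open");
        in_group_ = false;  // the group is complete, but nothing is flushed
    }

    void cancel_group() {
        if (!in_group_) throw std::logic_error("no group open");
        in_group_ = false;
        pending_.clear();   // drops the group and every other unflushed change
    }

    void add(const std::string& item) {
        pending_.push_back(item);
        // Autoflush never happens while a group is open.
        if (!in_group_ && pending_.size() >= threshold_) do_flush();
    }

    void flush() {
        if (in_group_) throw std::logic_error("flush() called during a group");
        do_flush();
    }

    const std::vector<std::string>& flushed() const { return flushed_; }
    std::size_t pending_count() const { return pending_.size(); }

private:
    void do_flush() {
        assert(!in_group_);  // invariant: a group is never split by a flush
        flushed_.insert(flushed_.end(), pending_.begin(), pending_.end());
        pending_.clear();
    }

    std::size_t threshold_;
    bool in_group_ = false;
    std::vector<std::string> pending_;
    std::vector<std::string> flushed_;
};
```

Note that in this sketch cancel_group() also throws away changes made before begin_group() if they had not been flushed, which is the behaviour described for cancel_group() above.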
Does this make sense, or have I missed something? I think this would be
simple to implement (but am a little out of touch with the relevant part
of the code, so may be missing a problem).
> Perhaps we should *always* require a call to flush() (or a new close()
> method) before a database is destroyed? At present, any errors thrown
> by the implicit flush() in the destructor are caught and ignored, which
> isn't ideal at all.
I don't like this idea. Better would be to recommend that flush() is
called before destroying a database - but if it hasn't been, then call
flush() in the destructor and ignore errors. If users don't care (eg,
situation 1, above), they don't need to call flush(), but if they do care
they will call flush() and receive error reports.
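A small sketch of what I mean, again with a toy class and not real Xapian code. FlushOnDestroyDb, modify() and the fail_on_flush switch are invented for illustration; the point is only that an explicit flush() can report an error to the caller, while the destructor's fallback flush has to swallow it.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a database whose flush() can fail.
class FlushOnDestroyDb {
public:
    explicit FlushOnDestroyDb(bool fail_on_flush) : fail_(fail_on_flush) {}

    // Fallback flush: a destructor must not let exceptions escape, so any
    // error here is caught and ignored.
    ~FlushOnDestroyDb() {
        if (dirty_) {
            try {
                flush();
            } catch (...) {
                // ignored: there is nobody left to report the error to
            }
        }
    }

    void modify() { dirty_ = true; }

    // Explicit flush: errors propagate to the caller.
    void flush() {
        if (fail_) throw std::runtime_error("flush failed");
        dirty_ = false;
    }

    bool dirty() const { return dirty_; }

private:
    bool fail_;
    bool dirty_ = false;
};

// Usage: a caller who cares flushes explicitly and sees the error.
inline bool explicit_flush_reports_error() {
    FlushOnDestroyDb db(true);
    db.modify();
    try {
        db.flush();
    } catch (const std::runtime_error&) {
        return true;  // the error was reported to the caller
    }
    return false;
}
```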
--
Richard