[Xapian-discuss] Xapian::Document and threads

Jean-Francois Dockes jf at dockes.org
Mon May 5 07:12:15 BST 2014


Olly Betts writes:
 > On Sun, May 04, 2014 at 08:16:50PM +0200, Jean-Francois Dockes wrote:
 > > While investigating very infrequent crashes in the Recoll indexer, I have
 > > come to a very basic question: is it safe to pass a copy of a
 > > Xapian::Document from thread to thread (multiple threads queue documents,
 > > other thread updates the index) ?
 > > 
 > > I don't seem to get directly into trouble while doing this, but I don't see
 > > anything either in the RefCntr implementation which would explicitely make
 > > it thread-safe, so I am wondering. Maybe I'm just missing the obvious.
 > 
 > http://getting-started-with-xapian.readthedocs.org/en/latest/concepts/concurrency.html

This only warns about documents read from the index and sharing Database
references. Mine are just created from external data.

Mentioning only the Xapian::Database reference problem in the document
almost makes things worse, as one is led to believe that it is the main or
only issue. It is not my case anyway, and I think that the code is older
than the document.

 > > Of course, I could create the docs on the heap instead, and pass pointers,
 > > but is this needed ?
 > 
 > That might help if it's just the reference counting that's the issue,
 > but bear in mind that the underlying object gives you no thread-safety
 > guarantees either.

Yes, it's the reference counting which is the issue. As the objects are
clearly described as sharing data, I would never have thought of accessing
them from multiple threads.

The threads which create the documents do not keep a reference to them:
they give them away to the indexing thread, so things appear safe, except
that the reference counting is not.

This is not necessarily obvious. For example, the g++ standard library
strings have similar multithreading issues, but as far as I can read the
code, they protect the reference counters.

I am not requesting thread-safe counters, and I understand that faking copy
semantics may be useful, but I think that unprotected reference counters
should be warned about. 

This is especially true because there is nothing that the library user can
do to mitigate the problem. Data accesses can be protected with a lock if
you want to share these objects, but this would be difficult with the
counters, so the only possibility is to use pointers, even in seemingly
trivial cases.
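
To make the pointer approach concrete, here is a minimal sketch of what I
mean. It is not the actual indexer code: Doc is a stand-in for
Xapian::Document, and DocQueue and run_pipeline are hypothetical names made
up for the illustration. The point is that only a pointer to a
heap-allocated document crosses the thread boundary, under a mutex, so no
handle is ever copied (and no reference counter touched) by two threads.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Stand-in for Xapian::Document (hypothetical, for illustration only).
struct Doc {
    std::string data;
};

// Mutex-protected queue which transfers ownership of heap-allocated docs.
class DocQueue {
public:
    void put(std::unique_ptr<Doc> doc) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(doc));
        }
        cond_.notify_one();
    }

    // Blocks until a document is available, then hands it to the caller.
    std::unique_ptr<Doc> take() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        std::unique_ptr<Doc> doc = std::move(queue_.front());
        queue_.pop();
        return doc;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<std::unique_ptr<Doc>> queue_;
};

// Runs nproducers threads, each queueing per_producer docs, and a single
// consumer which takes them all. Returns the number of docs received.
int run_pipeline(int nproducers, int per_producer) {
    DocQueue q;
    int received = 0;

    std::thread consumer([&] {
        for (int i = 0; i < nproducers * per_producer; ++i) {
            std::unique_ptr<Doc> doc = q.take();
            // The real indexing thread would update the index here,
            // e.g. with WritableDatabase::add_document(*doc).
            if (doc && !doc->data.empty()) ++received;
        }
    });

    std::vector<std::thread> producers;
    for (int p = 0; p < nproducers; ++p) {
        producers.emplace_back([&q, p, per_producer] {
            for (int i = 0; i < per_producer; ++i) {
                std::unique_ptr<Doc> doc(new Doc);
                doc->data = "doc " + std::to_string(p) + "/" + std::to_string(i);
                q.put(std::move(doc));  // the producer gives the doc away
            }
        });
    }

    for (auto& t : producers) t.join();
    consumer.join();
    return received;  // only read after the consumer has been joined
}
```

The producer keeps nothing after put(), which matches the "give them away"
usage described above. Using unique_ptr rather than a raw pointer is just a
convenience to make the ownership transfer explicit.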

This is a documentation issue: please add a note in the header comments
stating that only pointers to Xapian::Document (and others) can hop threads.

I'm not sure if the base.h header comments speculating about the atomicity
of pointer copying are a fine example of English humour or if they were
placed to further confuse the investigator :)

Jf


