How to make database build threaded?

Jean-Francois Dockes jf at dockes.org
Fri Sep 14 17:22:11 BST 2018


Franco Martelli writes:
 > On 14/09/2018 at 09:30, Jean-Francois Dockes wrote:
 > > Hi,
 > > 
 > > You may be interested in how Recoll does it:
 > > 
 > > https://www.lesbonscomptes.com/recoll/idxthreads/threadingRecoll.html
 > > 
 > > A few things in the document are slightly obsolete (esp. the last
 > > paragraph: recollindex now does use vfork()), but it's overall quite close
 > > to how the current indexer works.
 > > 
 > > jfd
 > > 
 > Thanks for your answer. Briefly, it's a no:
 > 
 > > The Xapian library index updating code is not designed for
 > > multi-threading and must stay protected from multiple accesses. 

Yes, obviously the Xapian part stays single-threaded.

 > Just for evaluation purposes, could you provide me with some links to the
 > code showing how Recoll parallelizes "Data extraction and Conversion" and
 > "Term generation"?

The code repository is here:  https://opensourceprojects.eu/p/recoll1/code/

Or else download a tar release from here: https://www.lesbonscomptes.com/recoll/download.html

The extraction code is mostly under the "internfile" directory.

Look in index/fsindexer.cpp and rcldb/rcldb.cpp for the job queues.
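
As a rough illustration of the job-queue arrangement (this is not Recoll's
actual code, and it does not use Xapian), here is a minimal Python sketch in
which a pool of worker threads does the extraction and term generation while
a single thread performs every index update. All names (build_index,
extract_terms, worker, writer, STOP) are made up for the example, and a plain
dict stands in for the database.

```python
import queue
import threading

STOP = object()  # sentinel telling a thread to exit


def extract_terms(text):
    # Stand-in for data extraction, conversion and term generation.
    return sorted(set(text.lower().split()))


def worker(in_q, out_q):
    # Several of these run in parallel; none of them touches the index.
    while True:
        job = in_q.get()
        if job is STOP:
            break
        doc_id, text = job
        out_q.put((doc_id, extract_terms(text)))


def writer(out_q, index):
    # The only thread that updates the index (the single-threaded part).
    while True:
        item = out_q.get()
        if item is STOP:
            break
        doc_id, terms = item
        for term in terms:
            index.setdefault(term, set()).add(doc_id)


def build_index(docs, num_workers=4):
    in_q = queue.Queue(maxsize=100)   # bounded, so producers cannot run away
    out_q = queue.Queue(maxsize=100)
    index = {}

    workers = [threading.Thread(target=worker, args=(in_q, out_q))
               for _ in range(num_workers)]
    writer_thread = threading.Thread(target=writer, args=(out_q, index))
    for t in workers:
        t.start()
    writer_thread.start()

    for doc in docs.items():
        in_q.put(doc)
    for _ in workers:
        in_q.put(STOP)        # one sentinel per worker
    for t in workers:
        t.join()

    out_q.put(STOP)           # all results are queued before this sentinel
    writer_thread.join()
    return index
```

Calling build_index({1: "Hello world", 2: "hello Xapian"}) returns a dict
mapping each lowercased term to the set of document ids containing it. The
point of the layout is that only writer() ever sees the index, so the update
code needs no locking of its own.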

Get in touch with me directly if you have questions; this is not really
Xapian-related (once you've realized that the db work will stay
single-threaded).

jfd


