[GSoC] Questions about project Text-Extraction Libraries

Bruno Baruffaldi baruffaldibruno at gmail.com
Thu Mar 21 12:31:26 GMT 2019


Hello!

I have a few questions about the Text-Extraction Libraries project.

Firstly, I think that isolating library bugs in subprocesses could work,
but I am not sure how to handle deadlocks or infinite loops. I feel that
using a timer is the only way to deal with them, but I would like to know
what you think.
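
To make the timer idea concrete, here is a minimal sketch in Python. It is
not omindex code: the helper name run_with_timeout and the two stand-in
commands are assumptions made up for illustration. It runs a worker in a
child process and gives up if the worker does not finish within a time
limit, so a hang or crash in the worker cannot take the parent down:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd in a child process and return (status, stdout).

    status is "ok", "failed" (non-zero exit or killed by a signal),
    or "timeout" (the child exceeded timeout_s seconds).
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before
        # re-raising, so no stray worker is left running here.
        return ("timeout", "")
    if proc.returncode != 0:
        # Covers both error exits and crashes such as a segfault.
        return ("failed", proc.stdout)
    return ("ok", proc.stdout)


# Hypothetical stand-ins for a text-extraction worker.
ok_cmd = [sys.executable, "-c", "print('extracted text')"]
hang_cmd = [sys.executable, "-c", "while True: pass"]

if __name__ == "__main__":
    print(run_with_timeout(ok_cmd, 10))
    print(run_with_timeout(hang_cmd, 1))
```

The open question with this approach is how to choose the time limit, since
a large but valid file may legitimately take a long time to process.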

Secondly, I have been reading the source code of omindex, but I cannot
figure out whether it is possible to group all file formats under the same
interface. When indexing files, are all file formats treated in a similar
way, or are there special formats that require different handling (beyond
the use of external filters)?

Finally, I would like to know whether omindex uses multithreading when
indexing files, or whether you think it could be added to speed indexing up.

Cheers,
   Bruno Baruffaldi