GSoC 2016: Text-Extraction Libraries in Omega
James Aylett
james-xapian at tartarus.org
Wed Mar 9 17:06:40 GMT 2016
On Mon, Mar 07, 2016 at 03:31:34PM -0800, Philip Chung wrote:
> I've been looking at the project-ideas list and I'm interested in making
> Omega use libraries instead of external programs.
>
> Right now I'm trying to get Olly's patch that was linked there to apply
> to the current master. From that point I would see if I could generalize
> this to other types of extraction.
>
> I was thinking of different executables for each type of extraction. Is
> this a good way to go, or is there a better way to go about it?
Hi, Philip. At the moment we use a different executable for each type,
and we'll want to continue doing so. The project is more about
preferring libraries, so that we don't have to invoke an external
program for common file formats, which should improve indexing
speed.

I'm not sure how you propose generalising use of a library for
extraction; how would a user configure omindex to know how to call the
relevant library functions?
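
To make the configuration question concrete, here is a minimal sketch of one possible arrangement: built-in extractors are compiled in and looked up by MIME type, while user configuration only ever names external commands. It is written in Python for brevity and is not omindex's actual code. Every name in it (`LIBRARY_EXTRACTORS`, `EXTERNAL_COMMANDS`, `extract_text`, the `{path}` placeholder, the `application/x-demo` type) is hypothetical.

```python
import shlex
import subprocess


def extract_plain_text(path):
    """In-process extractor: reads the file directly, no child process."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read()


# Hypothetical built-in table: MIME type -> in-process function.
# In a real indexer each entry would wrap a call into a format library.
LIBRARY_EXTRACTORS = {"text/plain": extract_plain_text}

# Hypothetical user-configurable table: MIME type -> external command.
# "{path}" is an assumed placeholder for the file being indexed.
EXTERNAL_COMMANDS = {"application/x-demo": "cat {path}"}


def extract_text(path, mime_type):
    """Prefer an in-process extractor; fall back to an external command."""
    extractor = LIBRARY_EXTRACTORS.get(mime_type)
    if extractor is not None:
        return extractor(path)

    template = EXTERNAL_COMMANDS.get(mime_type)
    if template is None:
        raise LookupError("no extractor configured for " + mime_type)

    # Split the template first, then substitute, so paths containing
    # spaces stay a single argument and no shell is involved.
    argv = [arg.replace("{path}", path) for arg in shlex.split(template)]
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout
```

In this arrangement the user never has to describe library calls at all: the library-backed types are fixed at build time, and only the fallback table is open to configuration.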
J
--
James Aylett, occasional trouble-maker
xapian.org