[Xapian-discuss] patch proposal: omindex library or daemon

Liam xapian at networkimprov.net
Thu Nov 3 02:21:40 GMT 2011


On Wed, Nov 2, 2011 at 6:18 PM, Olly Betts <olly at survex.com> wrote:

>
> But overall, I think it's probably simpler if you just work on whatever
> you want to achieve, and we can decide what to do about it once there's
> actually something to look at.  A long email discussion about what the
> API(s) should be is all very lovely, but a (mostly) working
> implementation of them gives much better insights into what actually
> works.
>

Thanks for your feedback. I'm afraid I clouded the issues somewhat because
I was coming up to speed on how everything works; apologies.

All I need is to copy/paste the mime-file conversion logic into a separate,
non-shared library, with a single API call. This is the code in
index_mimetype() that takes a filename and produces a set of plain-text
strings (author, title, sample, keywords, dump, md5), plus the mime_map
defined in main(). Also, the makefile entry for this library would
reference the external sources it depends on.
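
A minimal sketch of what that single-call interface might look like, to make the proposal concrete. Every name here (ConvertedDocument, default_mime_map, mimetype_for, convert_file) is hypothetical and not taken from omindex. The three map entries are placeholders for the real mime_map, and lower-casing the extension before lookup is an assumption.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical result type: one plain-text string per field named above.
struct ConvertedDocument {
    std::string author, title, sample, keywords, dump, md5;
};

// Hypothetical stand-in for the mime_map built in main():
// file extension -> MIME type. Only a few illustrative entries.
inline std::map<std::string, std::string> default_mime_map() {
    return {
        {"txt", "text/plain"},
        {"html", "text/html"},
        {"pdf", "application/pdf"},
    };
}

// Look up the MIME type for a filename by its extension.
// Returns an empty string when there is no extension or it is unknown.
inline std::string mimetype_for(const std::map<std::string, std::string>& mime_map,
                                const std::string& filename) {
    std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos) return std::string();
    // A dot inside a directory component is not a file extension.
    std::string::size_type slash = filename.rfind('/');
    if (slash != std::string::npos && slash > dot) return std::string();
    std::string ext = filename.substr(dot + 1);
    // Assumption: extensions are matched case-insensitively.
    for (char& c : ext) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    std::map<std::string, std::string>::const_iterator it = mime_map.find(ext);
    return it == mime_map.end() ? std::string() : it->second;
}

// The single entry point (hypothetical signature, declaration only).
// Its body would hold the conversion logic lifted from index_mimetype().
// Returns false when the file type is not recognised.
bool convert_file(const std::string& filename, ConvertedDocument& out);
```

A caller would pass a filename to convert_file() and read the six strings back from the struct, without needing to know which external filter did the conversion.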

Enabling zip-file unpacking in the mime converter in future would be great;
I don't think this change would conflict with that. For a group of
documents in an archive file, the library function could return a table of
results instead of a single set of strings.

I will have some questions as I work on this. In particular, I'd like
guidance on how to port DirectoryIterator::file_to_string() to the new
library, since the library wouldn't use DirectoryIterator.
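
One possible shape for a standalone replacement, sketched with standard C++ streams only. This is not the existing omindex implementation and does not reproduce whatever platform-specific handling DirectoryIterator::file_to_string() does. The free-function signature and the choice to throw on failure are assumptions.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical free function: read the whole of `path` in binary mode
// and return its bytes. Throws std::runtime_error if the file cannot
// be opened.
inline std::string file_to_string(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buf;
    buf << in.rdbuf();  // copy the entire stream into the buffer
    return buf.str();
}
```

Taking a plain path keeps the function independent of directory traversal, so the library could call it without carrying DirectoryIterator along.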

Assuming that sounds rational to you, I'll get started!

