[Xapian-discuss] patch proposal: omindex library or daemon

Liam xapian at networkimprov.net
Sun Oct 23 20:44:49 BST 2011


On Sun, Oct 23, 2011 at 11:40 AM, James Aylett <james-xapian at tartarus.org> wrote:

> On 17 Oct 2011, at 22:25, Liam wrote:
>
> > For apps which re/index files frequently and need format conversion, I'd
> > like to propose a patch for one of...
> >
> > Omindex library (thread safe)
> > Omindex daemon mode
>
> Omindex doesn't do a huge amount; there's a fairly simple system for
> figuring out which external helper to use for a given file (which is mostly
> libmagic and a MIME type to helper map), and there's some infrastructure for
> defending against runaway helpers which would probably want to be more
> flexible or even totally different in a library context, and wouldn't even
> be quite the same in a daemon context. The rest is a directory tree walker,
> a couple of internal handlers, and a pretty small amount of code that
> directly interfaces with Xapian.
>
> I guess I'm not entirely sure which bits of omindex you think are most
> valuable to pull into other systems. Given the niche use of it, if this was
> interesting I'd probably support a refactor and intermediate build step that
> generates a library used to create omindex, rather than installing another
> dynamic library for all users who won't need this. That would enable you to
> reuse bits of omindex as needed.
>

Hallo James, thanks for your input.

To my mind the core of omindex is the file format converter, plus the code
that feeds the results into the db. I don't need the directory walker myself
(and filesystem-based collections are decreasingly common in online contexts).
Perhaps the right approach is a new class derived from WritableDatabase,
which you can hand filenames to, with some options?
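
A minimal sketch of that shape, under loud assumptions: FileIndexer,
add_converter and index_file are hypothetical names, not part of Xapian or
omindex. To stay self-contained it does not derive from WritableDatabase;
a callback stands in for the db, and a plain function stands in for an
external format-conversion helper.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch: none of these names exist in Xapian or omindex.
class FileIndexer {
public:
    // Receives (filename, extracted text). Stands in for adding a
    // document to the database.
    using Sink = std::function<void(const std::string&, const std::string&)>;
    // Turns a file into plain text. Stands in for running an external
    // helper chosen by MIME type.
    using Converter = std::function<std::string(const std::string&)>;

    explicit FileIndexer(Sink sink) : sink_(std::move(sink)) {}

    // Register the converter to use for one MIME type.
    void add_converter(const std::string& mime_type, Converter conv) {
        assert(conv && "converter must be callable");
        converters_[mime_type] = std::move(conv);
    }

    // Convert the file and hand the text to the sink.
    // Returns false when no converter is registered for the MIME type.
    bool index_file(const std::string& filename,
                    const std::string& mime_type) {
        auto it = converters_.find(mime_type);
        if (it == converters_.end()) return false;
        sink_(filename, it->second(filename));
        return true;
    }

private:
    Sink sink_;
    std::map<std::string, Converter> converters_;
};
```

The caller supplies the MIME type here; detecting it (omindex leans on
libmagic for that) and guarding against runaway helpers are left out.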

And yes, a static lib which gets pulled into the omindex executable sounds
right.


> Certainly we'd be happy to see work done to make Xapian available to Node
> users — that would be awesome. I know very little about Node & V8, but I
> assume we'd be looking at object & function templates in V8 to gain access
> to Xapian, and then some Node helpers to make it easier for developers to
> use in an idiomatic fashion.


I'm banging away as we speak on a Node module which enables the search
example in the quickstart guide. Will post to github as soon as it's
working. (It's an annoyingly large amount of glue code.)

BTW, if there's a downloadable database somewhere (for 1.0.x, the version
that ships with Ubuntu 10.04), that'd be a help for testing.

