[Xapian-discuss] patch proposal: omindex library or daemon

Olly Betts olly at survex.com
Thu Nov 3 01:18:51 GMT 2011


On Sun, Oct 30, 2011 at 07:40:39PM -0700, Liam wrote:
> On Wed, Oct 26, 2011 at 2:05 AM, James Aylett <james-xapian at tartarus.org> wrote:
> 
> > I was talking to Olly about this before the flight yesterday, and he has a
> > bunch of changes on top of the current system that he's done for a client
> > but not yet had a chance to merge. I imagine he'll surface at some point in
> > the next couple of days once he's got back on top of things again, although
> > obviously no promises how long it'll take him to extract those changes from
> > any confidential work.
> 
> Checking back on this... If pending omindex changes don't impact the
> mime-conversion code, we could create the mime-converter library and defer
> revving omindex to use it.

OK, I've finally managed to catch up on this monster thread.  Sorry for
taking so long.

Unfortunately, the stack of changes I have touches pretty much
everything in the parts of omindex you're talking about.

I've already merged some of the easier stuff, but there's a lot more
which needs tidying up and hard-wired local details factoring out into
configurable settings, etc.  I can't just make the current patches
public right now, but once I'm back from my travels I can take a look
and see what's what.

I don't really want to block useful work on Omega for months because of
this, but I would really rather avoid creating more conflicts than
necessary.  So refactoring for the sake of refactoring is probably
better avoided.

A few random points about things mentioned in the thread so far:

* You can't really usefully subclass most Xapian classes (except for the
  ones which are explicitly intended to be subclassed, obviously)
  because they are mostly reference-counted handles with non-virtual
  methods.  So if you subclass Xapian::Document then pass the object
  into Xapian, it'll only care about the internal reference-counted
  class - your subclassed parts will effectively just get sliced off.
  (Making the methods virtual wouldn't really help...)

* There are certainly things in Omega which I'd like to see available
  as part of the API (highlighting matching terms is the major one,
  which is easy to do badly, but rather hard to do well, and Omega
  does it fairly well, taking stemming into account for example).

* Personally I'm not really sure I want to commit to maintaining a
  public API to everything Omega does.  The internal interfaces would
  be limiting if we couldn't change them at will - for example, adding
  support for indexing files inside a zip file attached to an email
  would require significant changes.
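
To make the first point above concrete, here is a minimal C++ sketch of
the slicing effect.  It doesn't use Xapian itself: Handle, MyDoc, Store
and round_trip() are hypothetical stand-ins invented for illustration,
with Handle playing the part of a reference-counted handle class such as
Xapian::Document and Store playing the part of the library.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a handle class: a thin, copyable wrapper
// around shared internal state, with non-virtual methods.
class Handle {
    struct Internal { std::string data; };
    std::shared_ptr<Internal> internal;
  public:
    Handle() : internal(std::make_shared<Internal>()) {}
    void set_data(const std::string& d) { internal->data = d; }
    std::string get_data() const { return internal->data; }
};

// A user subclass which adds its own state on top of the handle.
class MyDoc : public Handle {
  public:
    std::string extra;
};

// Hypothetical stand-in for the library: it keeps a copy of the handle,
// so only the Handle part (the pointer to the internals) is retained.
class Store {
    Handle kept;
  public:
    void add(const Handle& h) { kept = h; }  // MyDoc::extra is sliced off
    const Handle& get() const { return kept; }
};

// Pass a subclassed object through the "library" and get back what it kept.
Handle round_trip(const MyDoc& doc) {
    Store store;
    store.add(doc);
    return store.get();
}
```

Calling round_trip() with a MyDoc gives back a plain Handle.  It still
shares the internal data with the original object, but the "extra"
member isn't part of what the store kept.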

But overall, I think it's probably simpler if you just work on whatever
you want to achieve, and we can decide what to do about it once there's
actually something to look at.  A long email discussion about what the
API(s) should be is all very lovely, but a (mostly) working
implementation of them gives much better insights into what actually
works.

Cheers,
    Olly


