[Xapian-discuss] Omindex Filters

James Aylett james-xapian at tartarus.org
Thu Sep 18 11:46:45 BST 2008


On Thu, Sep 18, 2008 at 04:11:15AM +0100, Olly Betts wrote:

> > How about XML for the output so we can incorporate any additional
> > meta-data.
> 
> That's essentially why Recoll's filters convert to HTML.  The main
> issue is that it adds the overhead of the external script converting
> to XML and then omindex parsing the XML to get back to the plain
> text.

I'm -1 on XML as an intermediate format, and -2 on HTML. I'm currently
tending towards the idea that we should initially just implement text,
since that will solve a lot of problems at a reasonable level (and
people can still use scriptindex), and then we can think about more
complex things later. (*Possibly* we could have the filter mechanism
use any of the internal parsers, meaning if you really wanted to
convert to HTML and parse that for extra metadata, you could.)
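
To illustrate the text-only idea, here is a minimal Python sketch of running an external filter and taking its standard output as the plain text to index. It is not how omindex implements filters: the function name run_text_filter, its parameters, and the choice of UTF-8 are all assumptions made for the example.

```python
import subprocess


def run_text_filter(command, path, timeout=60):
    """Run a hypothetical external filter on `path` and return its stdout as text.

    `command` is an argument list (for example a converter program and its
    options); the file to convert is appended as the final argument.
    """
    result = subprocess.run(
        command + [path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,   # do not let a stuck filter hang the indexer
        check=True,        # raise CalledProcessError on a non-zero exit status
    )
    # Assumption: the filter emits UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

Because the filter's output is already plain text, nothing has to be parsed a second time on the indexing side, which is the overhead the quoted message is concerned about.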

> It seems that you just have to use $TMPDIR or /tmp and hope that the
> system is sanely configured.

Trust the OS to manage $TMPDIR correctly. On any decent, well-tuned OS
it'll be efficient. (On many it'll actually be a tmpfs anyway.)
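
As a sketch of what relying on $TMPDIR can look like in practice, the Python example below lets the standard library choose the temporary directory. The helper name write_scratch_file and the "filter-" prefix are hypothetical and not part of omindex.

```python
import os
import tempfile


def write_scratch_file(data):
    """Write bytes to a new scratch file in the OS temporary directory.

    Returns the path; the caller is responsible for deleting the file.
    """
    # tempfile consults TMPDIR (then TEMP and TMP) and falls back to
    # platform defaults such as /tmp, so no directory is hard-coded here.
    fd, path = tempfile.mkstemp(prefix="filter-", suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

mkstemp creates the file with a unique name, readable and writable only by the creating user, so the code does not need its own naming scheme.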

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


