[Xapian-tickets] [Xapian] #568: omindex: fallback filter(s)?

Xapian nobody at xapian.org
Fri Sep 23 04:43:13 BST 2022


#568: omindex: fallback filter(s)?
-------------------------+-------------------------------
 Reporter:  Charles      |             Owner:  Olly Betts
     Type:  enhancement  |            Status:  new
 Priority:  normal       |         Milestone:  1.5.0
Component:  Omega        |           Version:  git master
 Severity:  normal       |        Resolution:
 Keywords:               |        Blocked By:
 Blocking:               |  Operating System:  All
-------------------------+-------------------------------
Changes (by Olly Betts):

 * version:  SVN trunk => git master
 * milestone:  1.4.x => 1.5.0

Comment:

 Currently, specifying `--filter` overrides any filter previously set for
 the specified MIME type, so changing it to build a chain of filters
 instead is potentially incompatible with existing usage.

 We could clearly come up with a way to specify a list of filters though.

 Meanwhile a simple version of this can be achieved with a wrapper script
 which tries tools in turn until one succeeds, e.g.:

 {{{
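 #!/bin/sh
 # Try antiword first; fall back to catdoc only if antiword fails.
 # The script's exit status is that of the last tool run.
 set -e
 antiword "$@" || catdoc "$@"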
 }}}

 The main downside of this approach is probably that the extractors all
 have to produce the same format (plaintext, HTML or SVG), in the same
 character encoding, and deliver it the same way (either stdout or a
 temporary file).  If omindex handled this it could mix and match.  With
 git master you could also mix the new plugin extractors with command line
 extractors.

 If output is to stdout, the wrapper script also potentially risks a
 failing tool producing partial output on stdout before failing.  If a
 temporary file is used the subsequent tool should just overwrite any
 partial output.
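
 One way to avoid the partial-output problem when writing to stdout is to
 buffer each tool's output and only emit it if the tool succeeds.  The
 following is a minimal sketch, not something omindex provides: the helper
 name `run_buffered` is made up for illustration, and it assumes `mktemp`
 is available (it is common but not required by POSIX).

```shell
#!/bin/sh
# Sketch only: run_buffered is a hypothetical helper, not part of Omega.
#
# run_buffered CMD [ARGS...]
#   Runs CMD with its stdout captured in a temporary file.  The captured
#   output is copied to our stdout only if CMD exits successfully, so a
#   tool which fails part-way through contributes nothing.
run_buffered() {
    # Assumes mktemp(1) exists; give up on this tool if it doesn't work.
    buf=$(mktemp) || return 1
    if "$@" > "$buf"; then
        cat "$buf"
        status=0
    else
        # In the else branch, $? is the exit status of the failed command.
        status=$?
    fi
    rm -f "$buf"
    return "$status"
}
```

 The last line of the wrapper script above would then become something
 like `run_buffered antiword "$@" || run_buffered catdoc "$@"`.  This
 trades a little extra I/O for the guarantee that stdout only ever
 carries the output of one tool.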
-- 
Ticket URL: <https://trac.xapian.org/ticket/568#comment:3>
Xapian <https://xapian.org/>