[Xapian-tickets] [Xapian] #569: omindex man page -F text misleading
Xapian
nobody at xapian.org
Sat Oct 8 13:40:32 BST 2011
#569: omindex man page -F text misleading
--------------------+-------------------------------------------------------
Reporter: catkin | Owner: olly
Type: defect | Status: new
Priority: normal | Milestone:
Component: Omega | Version: 1.2.5
Severity: normal | Blockedby:
Platform: All | Blocking:
--------------------+-------------------------------------------------------
From the omindex man page:
-F, --filter=TYPE:CMD
process files with MIME Content-Type TYPE using command CMD, which
should produce UTF-8 text on stdout e.g.
-Fapplication/octet-stream:'strings -n8
This could be read as meaning that omindex examines file contents to
determine their MIME type (that is how I understood it). However, according
to Olly's posting, subject "Re: [Xapian-discuss] Tika 0.8 failure rates",
dated 5 Oct 2011:
By default, omindex currently uses a list of extension->MIME
content-type mappings, and only consults the magic library for
extensions it doesn't know. So any file with a .doc extension will be
considered as application/msword (unless you run omindex with
'--mime-type=doc:').
A note about this could be added to the omindex man page and referenced
from the -F and -M options descriptions.
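
To illustrate the lookup order such a note would need to describe, here is a
rough Python sketch of the behaviour in Olly's posting. It is not omindex's
actual code: EXTENSION_MAP, sniff_content() and guess_mime_type() are
hypothetical names, the table is a made-up subset, and sniff_content() is only
a stand-in for the magic library.

```python
import os

# Hypothetical subset of an extension -> MIME type table
# (an assumption for illustration, not omindex's real table).
EXTENSION_MAP = {
    "doc": "application/msword",
    "pdf": "application/pdf",
    "txt": "text/plain",
}


def sniff_content(path):
    """Stand-in for a magic library lookup; only recognises a PDF header."""
    with open(path, "rb") as f:
        head = f.read(5)
    if head == b"%PDF-":
        return "application/pdf"
    return "application/octet-stream"


def guess_mime_type(path, extension_map=EXTENSION_MAP):
    """Return a MIME type, trusting the extension before the contents."""
    ext = os.path.splitext(path)[1].lstrip(".")
    mime = extension_map.get(ext)
    if mime is not None:
        # Known extension: the file contents are never examined.
        return mime
    # Unknown extension: fall back to inspecting the contents.
    return sniff_content(path)
```

Under this model, a PDF file that happens to be named report.doc is reported
as application/msword, and only falls through to content inspection when "doc"
is absent from the table, which is how I read the effect of
'--mime-type=doc:' in the quoted posting.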
--
Ticket URL: <http://trac.xapian.org/ticket/569>
Xapian <http://xapian.org/>
Xapian