[Xapian-discuss] docx support
Frank Bruzzaniti
frank.bruzzaniti at gmail.com
Thu Jul 24 14:42:31 BST 2008
I added a MIME type in omindex.cc, but when I run omindex I get this:
Indexing "/Test.docx" as
application/vnd.openxmlformats-officedocument.wordprocessingml.document
... unknown MIME type - skipping
What other source files do I need to look at?
Olly Betts wrote:
> On Thu, Jul 24, 2008 at 02:51:26AM +0100, Olly Betts wrote:
>
>> Rather than writing a full guide here, I'm going to write this up as a
>> wiki page, since that will be easier for others to find in the future.
>> I'll reply again when I'm done.
>>
>
> http://trac.xapian.org/wiki/FAQ/OmegaNewFileFormat
>
>
>>> Is there any option/procedure to add a new mime plugin?
>>> For example, if you rename a .docx file to .zip, you can retrieve the
>>> text from document.xml
>>>
>
> That's quite easy to do - you should be able to heavily base the code
> on that which handles OpenDocument format. This extracts XML files
> from inside a Zip format file with extension .odt or similar and then
> does simple parsing to extract the document text.
>
> Cheers,
> Olly
>
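
As a rough illustration of the approach Olly describes (pull an XML file out of the Zip container, then do simple parsing to get the text), here is a minimal Python sketch for .docx. It is not Omega's code: omindex.cc is C++, and the function name docx_text is hypothetical. It assumes the main document part is stored at word/document.xml, which is where Word normally writes it, and it uses only the standard-library zipfile and xml.etree modules.

```python
import zipfile
from xml.etree import ElementTree

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text(source):
    """Return the text of a .docx file, one line per paragraph.

    `source` is a filesystem path or a binary file-like object.
    Assumption: the main document part lives at word/document.xml.
    """
    # A .docx file is an ordinary Zip archive, so no renaming is needed.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ElementTree.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Visible text sits in <w:t> elements inside each paragraph's runs.
        pieces = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

Calling docx_text("/Test.docx") would return the document text as a single string. This sketch ignores headers, footers, footnotes and anything nested in text boxes, so treat it as a starting point rather than a complete extractor.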