[Xapian-discuss] docx support

Frank Bruzzaniti frank.bruzzaniti at gmail.com
Thu Jul 24 14:42:31 BST 2008


I added a MIME type in omindex.cc, but when I run it I get this:

Indexing "/Test.docx" as application/vnd.openxmlformats-officedocument.wordprocessingml.document ... unknown MIME type - skipping

What other source files do I need to look at?

Olly Betts wrote:
> On Thu, Jul 24, 2008 at 02:51:26AM +0100, Olly Betts wrote:
>   
>> Rather than writing a full guide here, I'm going to write this up as a
>> wiki page, since that will be easier for others to find in the future.
>> I'll reply again when I'm done.
>>     
>
> http://trac.xapian.org/wiki/FAQ/OmegaNewFileFormat
>
>   
>>> Is there any option/procedure to add a new mime plugin?
>>> For example, if you rename a .docx file to .zip, you can retrieve the
>>> text from document.xml
>>>       
>
> That's quite easy to do - you should be able to heavily base the code
> on that which handles OpenDocument format.  This extracts XML files
> from inside a Zip format file with extension .odt or similar and then
> does simple parsing to extract the document text.
>
> Cheers,
>     Olly
>   
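
As a rough illustration of the "simple parsing" step described above, here is
a minimal sketch in C++. It is not Omega's code: the function name
docx_xml_to_text is hypothetical, and it assumes the XML has already been
extracted from the archive (for example with
`unzip -p Test.docx word/document.xml`). It keeps only the character data
inside <w:t> elements and ends each <w:p> paragraph with a newline.

```cpp
#include <string>

// Hypothetical helper, not part of Omega: collect the text held in <w:t>
// elements of a word/document.xml string, one line per <w:p> paragraph.
// Assumes the XML was already pulled out of the .docx (a Zip archive).
// Entities such as &amp; are left undecoded to keep the sketch short.
std::string docx_xml_to_text(const std::string& xml) {
    std::string text;
    bool in_text = false;  // true while inside <w:t>...</w:t>
    std::string::size_type pos = 0;
    while (pos < xml.size()) {
        std::string::size_type lt = xml.find('<', pos);
        if (lt == std::string::npos) lt = xml.size();
        // Keep character data only when it sits inside a <w:t> element.
        if (in_text) text.append(xml, pos, lt - pos);
        if (lt == xml.size()) break;
        std::string::size_type gt = xml.find('>', lt);
        if (gt == std::string::npos) break;  // truncated tag: stop here
        const std::string tag = xml.substr(lt + 1, gt - lt - 1);
        if (tag == "w:t" || tag.compare(0, 4, "w:t ") == 0) {
            // Match <w:t> exactly so <w:tab/>, <w:tbl> and <w:tc> are ignored;
            // a self-closing <w:t .../> carries no text.
            in_text = (tag[tag.size() - 1] != '/');
        } else if (tag == "/w:t") {
            in_text = false;
        } else if (tag == "/w:p") {
            text += '\n';  // end of paragraph
        }
        pos = gt + 1;
    }
    return text;
}
```

Passing it the contents of word/document.xml should return the plain text that
would then be handed to the indexer. A real handler would also need to decode
XML entities and cope with '>' inside attribute values, which this sketch
does not attempt.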

