[Xapian-discuss] docx support
Frank Bruzzaniti
frank.bruzzaniti at gmail.com
Thu Jul 24 15:46:21 BST 2008
Yay, it works.

I added the following to omindex.cc:

mime_map["docx"] =
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document"; // Word 2007

// Start: Word 2007 .docx
} else if (startswith(mimetype,
           "application/vnd.openxmlformats-officedocument.wordprocessingml.")) {
    // Inspired by http://mjr.towers.org.uk/comp/sxw2text
    string safefile = shell_protect(file);
    string cmd = "unzip -p " + safefile + " word/document.xml";
    try {
        XmlParser xmlparser;
        xmlparser.parse_html(stdout_to_string(cmd));
        dump = xmlparser.dump;
    } catch (ReadError) {
        cout << "\"" << cmd << "\" failed - skipping\n";
        return;
    }
// End: Word 2007 .docx

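
For anyone wondering what the XML parsing step above is for: word/document.xml holds the document text wrapped in markup, so the handler has to reduce it to plain text before indexing. The following is only a rough sketch of that idea and is not Omega's XmlParser. The helper name strip_tags is hypothetical, and it does not decode entities such as &amp;.

```cpp
#include <string>

// Hypothetical helper (not part of Omega): remove everything between '<' and
// '>' and leave a single space where a run of tags followed some text, so
// that words from adjacent elements do not run together.
static std::string strip_tags(const std::string& xml) {
    std::string out;
    bool in_tag = false;
    for (char ch : xml) {
        if (ch == '<') {
            in_tag = true;
        } else if (ch == '>') {
            in_tag = false;
            // Add at most one separating space after text already emitted.
            if (!out.empty() && out.back() != ' ') out += ' ';
        } else if (!in_tag) {
            out += ch;
        }
    }
    // Drop the trailing separator left by the final closing tags.
    while (!out.empty() && out.back() == ' ') out.pop_back();
    return out;
}
```

Feeding it two WordprocessingML paragraphs containing "Hello" and "world" should give the two words separated by one space.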
Olly Betts wrote:
> On Thu, Jul 24, 2008 at 11:12:31PM +0930, Frank Bruzzaniti wrote:
>
>> I added a mime type in omindex.cc but when I run it I get this:
>>
>> Indexing "/Test.docx" as
>> application/vnd.openxmlformats-officedocument.wordprocessingml.document
>> ... unknown MIME type - skipping
>>
>> what other source files do I need to look at?
>>
>
> None - this is all omindex.cc.
>
> It sounds like you've added it to mime_map so that .docx is converted to
> that mime-type, but not added an "else if" case to actually handle the
> new mime-type. The new FAQ entry covers that too.
>
> If that's not it, send a patch of your change (diff -u format).
>
> Cheers,
> Olly
>
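
The two steps Olly describes (an entry in mime_map, plus an "else if" branch that handles the resulting MIME type) can be sketched in isolation like this. It is a simplified illustration, not the omindex.cc source: starts_with, mime_for_extension and has_handler are hypothetical names, and the text/plain entry is only there to give the map a second case.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the startswith() helper used in omindex.cc.
static bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Step 1: map a file extension to a MIME type (empty string if unknown).
static std::string mime_for_extension(const std::string& ext) {
    static const std::map<std::string, std::string> mime_map = {
        {"txt", "text/plain"},
        {"docx",
         "application/vnd.openxmlformats-officedocument.wordprocessingml.document"},
    };
    auto it = mime_map.find(ext);
    return it == mime_map.end() ? std::string() : it->second;
}

// Step 2: report whether some branch knows how to handle the MIME type.
// With only step 1 in place, a .docx file gets a MIME type but no branch
// matches it, which corresponds to the "unknown MIME type - skipping" case.
static bool has_handler(const std::string& mimetype) {
    if (mimetype == "text/plain") {
        return true;
    } else if (starts_with(mimetype,
            "application/vnd.openxmlformats-officedocument.wordprocessingml.")) {
        return true;
    }
    return false;
}
```

Matching on the prefix rather than the full string means other wordprocessingml subtypes would take the same branch, which is how the handler above is written.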