[Xapian-discuss] docx support
Frank Bruzzaniti
frank.bruzzaniti at gmail.com
Thu Jul 24 12:32:48 BST 2008
I was going to try Flax if I couldn't get this working on a Linux box.
One question I have about omindex: when I run a crawl I see:
Indexing "/New Spreadsheet.ots" as
application/vnd.oasis.opendocument.spreadsheet-template ... updated.
I assume omindex uses OpenOffice to do the conversion.
I can open *.docx with OpenOffice and save it as *.txt, so how come you
don't use OpenOffice for the bulk of your conversions?
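
For reference, a .docx file is a zip archive whose body text lives in
word/document.xml, so plain text can in principle be pulled out without
OpenOffice at all. Below is a minimal sketch of that idea, assuming Python
is available. The function name docx_to_text is made up for illustration,
this is not how omindex does its conversions, and it ignores headers,
footnotes and nested content such as text boxes:

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path):
    """Return the body text of a .docx file, one line per paragraph.

    `path` may be a filename or a binary file-like object.
    (Hypothetical helper, not part of Xapian or omindex.)
    """
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each paragraph is split into runs; the text sits in <w:t> elements.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

The resulting text could then be handed to whatever does the indexing, in
the same way as the output of any other external converter.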
Charlie Hull wrote:
> Olly Betts wrote:
>
>> On Thu, Jul 24, 2008 at 04:08:26AM +0930, Frank Bruzzaniti wrote:
>>
>>> Are Office 2007 formats like docx supported?
>>>
>> Out of the box, not unless antiword supports it. The last update to the
>> Debian packaged version was August 2006, so I suspect the answer is
>> "no".
>>
>>
> Just to say that we've looked at this for Flax and we're using the
> IFilter system, which since it is provided by Microsoft is pretty good
> with Microsoft formats. Of course, this only works on a Windows box, and
> needs COM, and it's not open source, so you'd probably need to parse
> into an intermediate format. There's a list of available IFilters on
> www.ifilter.org
>
> Cheers
>
> Charlie
>
> _______________________________________________
> Xapian-discuss mailing list
> Xapian-discuss at lists.xapian.org
> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>