[Xapian-tickets] [Xapian] #383: Patch to replace antiword with abiword

Xapian nobody at xapian.org
Fri Jun 12 16:58:44 BST 2009


#383: Patch to replace antiword with abiword
-------------------------+--------------------------------------------------
 Reporter:  frankjb      |       Owner:  olly
     Type:  enhancement  |      Status:  new 
 Priority:  normal       |   Milestone:      
Component:  Other        |     Version:      
 Severity:  normal       |    Keywords:      
Blockedby:               |    Platform:  All 
 Blocking:               |  
-------------------------+--------------------------------------------------

Comment(by frankjb):

 BTW I wasn't looking to replace antiword. I agree that if you need a fast
 default Word converter then antiword is probably the way to go. I found
 this patch useful because (apart from the example I submitted) I had
 noticed a client of mine has a heap of Word documents from circa 1998 that
 were saved as .doc files by WordPerfect.

 Unfortunately it was their defunct word processor and file format, so I
 needed an alternative.
 Since I found this patch useful, I just thought I'd share it in case
 anyone else had any weirdness.

 I can't upload the documents for testing as they are client files, but this
 is what I get from each converter.

 antiword:
 Word2: fast saved documents are not supported yet

 wvText:
 Could not convert into HTML

 abiword:
 I can see all the text in the body of the document, with a few extra little
 "artifacts". The start and end of the text file have perhaps 3-5 lines
 like:
 ¥-?!@?????-???????????€???v
 ??_???????????????????ö???????????????????????????????????????$?????$?$?????$?????$?????$?????$?????2???
 ?R?????R?????R?????R?????R???
 ?\?????R?????l???F?²?????²?????²?????²?????²?????²?????²?????²?????²?????´?????´?????´?????´?????´?????´?????ñ???4?%???:?Ò?????$???????????Ò????

 Open Office:
 Opens perfectly
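
 The results above suggest keeping antiword as the fast default and only
 falling back to abiword when it fails. A minimal sketch of that idea,
 not the submitted patch: convert_with_fallback and CONVERTERS are
 hypothetical names, and the abiword options shown are an assumption that
 should be checked against the installed version.

```python
import subprocess


def convert_with_fallback(path, converters, timeout=60):
    """Try each converter in order; return (name, text) from the first success.

    `converters` is a list of (name, argv) pairs. The document path is
    appended to argv, and the converter is expected to write plain text
    to stdout.
    """
    errors = []
    for name, argv in converters:
        try:
            result = subprocess.run(argv + [path], capture_output=True,
                                    timeout=timeout)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Converter not installed, not executable, or it hung.
            errors.append(f"{name}: {exc}")
            continue
        # Treat empty output as a failure too, since a converter may
        # report an unsupported file only on stderr.
        if result.returncode == 0 and result.stdout.strip():
            return name, result.stdout.decode("utf-8", errors="replace")
        errors.append(f"{name}: exit status {result.returncode}")
    raise RuntimeError("no converter succeeded: " + "; ".join(errors))


# Assumed command lines: fast default first, slower fallback second.
CONVERTERS = [
    ("antiword", ["antiword"]),
    ("abiword", ["abiword", "--to=txt", "--to-name=fd://1"]),
]
```

 Calling convert_with_fallback("old.doc", CONVERTERS) would then only pay
 the cost of starting abiword for documents antiword cannot handle.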

-- 
Ticket URL: <http://trac.xapian.org/ticket/383#comment:4>
Xapian <http://xapian.org/>