[Xapian-discuss] omindex => Unknown extension

Cedric Jeanneret cedric.jeanneret at camptocamp.com
Mon Apr 6 12:42:47 BST 2009


Hello!
I'm seeing the same thing here. I solved it by adding some RAM to my server.
Maybe the external calls can't be made properly when memory is low, and omindex crashes when launching programs such as antiword, pdftotext and so on...
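
If you want to test that guess, here is a rough Python sketch (not anything from omindex itself) that tries to launch an external program and reports whether the launch failed because the program is missing or because the system could not allocate resources for it. The function name check_converter and the arguments passed to the converters are my own placeholders; it only checks that the program can be started, not that the conversion succeeds.

```python
import errno
import shutil
import subprocess
import sys


def check_converter(command):
    """Try to launch an external program and say why the launch failed, if it did.

    Returns "ok", "not found", "out of memory", or "error: <reason>".
    Hypothetical helper for illustration; omindex does not work this way.
    """
    # First make sure the program is on the PATH (or is a valid path).
    if shutil.which(command[0]) is None:
        return "not found"
    try:
        # Only the launch matters here, so the output and exit code are ignored.
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except OSError as exc:
        # A failed fork/exec under memory pressure usually shows up as
        # ENOMEM or EAGAIN.
        if exc.errno in (errno.ENOMEM, errno.EAGAIN):
            return "out of memory"
        return f"error: {exc.strerror}"
    except subprocess.TimeoutExpired:
        return "error: timed out"
    return "ok"


if __name__ == "__main__":
    # Placeholder invocations: swap in a real file that omindex skipped.
    for cmd in (["antiword", "sample.doc"], ["pdftotext", "-v"]):
        print(cmd[0], "->", check_converter(cmd))
```

Running it while the indexer is busy (when memory is tightest) should give a hint whether launching the converters is what fails.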

Hope this can help you...

Regards,

C.

On Mon, 06 Apr 2009 12:45:40 +0200
"Eric Voisard" <eric.voisard at atisuher.ch> wrote:

> Hi all,
> 
> I'm having a recurrent problem with Omega's indexing.
> When I run omindex, it sometimes fails to recognize the extension of
> some files (.doc, .pdf) and skips them. In the same run, omindex is
> otherwise perfectly able to index other files with same extensions. The
> reason is not clear but it should occur before it selects a content
> converter since for example, if I manually run antiword on a .doc file
> that failed, it works...
> 
> Running omindex:
> Unknown extension: "/srv/xapian/targets/dir/subdir/file name.doc" - skipping
> 
> Manual conversion:
> host:/srv # antiword "/srv/xapian/targets/dir/subdir/file name.doc"
> <..plain text content of the file...>
> host:/srv #
> 
> Note that the target directory is a CIFS mount of a remote Windows
> shared directory. Charset is UTF-8.
> I don't think it has to do with the whitespace in the file name since
> other .doc filenames containing whitespace work.
> 
> Any idea?...
> 
> Thanks in advance, Eric
> ATIS Uher S.A. 
> CH 2046 Fontaines


-- 
Cédric Jeanneret                 |  System Administrator
021 619 10 32                    |  Camptocamp SA
cedric.jeanneret at camptocamp.com  |  PSE-A / EPFL


