[Xapian-devel] [Xapian-discuss] Dealing with image PDF's

Reini Urban rurban at x-ray.at
Thu Jul 31 08:53:24 BST 2008


2008/7/30 Frank Bruzzaniti <frank.bruzzaniti at gmail.com>:
>    // Inspired by http://mjr.towers.org.uk/comp/sxw2text
>    string safefile = shell_protect(file);
>    string cmd = "tifftopnm " + safefile + " | gocr -f UTF8 -";
>    try {
>        dump = stdout_to_string(cmd);
>    } catch (ReadError) {
>        cout << "\"" << cmd << "\" failed - skipping\n";
>        return;
>    }

Can we finally please use configure checks for such weird helper apps,
to avoid runtime exceptions where the system clearly has no such app?
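
A minimal sketch of how the result of such a configure check could be
used in the source, so the OCR pipeline is compiled out when the helpers
are missing. The macro names HAVE_TIFFTOPNM and HAVE_GOCR and the two
functions are hypothetical, not names from omega; the assumption is that
configure.ac would define the macros (for example via autoconf's
AC_CHECK_PROG together with AC_DEFINE) when it finds the programs.

```cpp
#include <string>

// Hypothetical macros: a configure check would define these when the
// helper programs are found at build time. Left undefined here, which
// models a system that has neither tifftopnm nor gocr.
// #define HAVE_TIFFTOPNM 1
// #define HAVE_GOCR 1

// Returns true only if both helpers were detected at configure time.
constexpr bool have_tiff_ocr() {
#if defined(HAVE_TIFFTOPNM) && defined(HAVE_GOCR)
    return true;
#else
    return false;
#endif
}

// Builds the OCR pipeline command for an already shell-escaped filename,
// or returns an empty string when support was compiled out, so the
// caller can skip the file without spawning a command that must fail.
std::string tiff_ocr_command(const std::string& safefile) {
    if (!have_tiff_ocr()) return std::string();
    return "tifftopnm " + safefile + " | gocr -f UTF8 -";
}
```

The caller would test for an empty command before calling
stdout_to_string(), so a missing helper never reaches the catch block.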

I once provided a huge patch to do that.
http://thread.gmane.org/gmane.comp.search.xapian.devel/783/

It is attached, applied to 1.0.5. But there's much more in this patch,
so some parts may need to be stripped. See the ChangeLog: TEXTCAT
support for language and charset detection and cached virtual
directories (zip, msg, pst, ...), to name a few. It has worked fine for
me for two years, and I haven't touched it since 0.9.6.
-- 
Reini Urban
http://phpwiki.org/ http://murbreak.at/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xapian-omega-1.0.5a.patch.gz
Type: application/x-gzip
Size: 42949 bytes
Desc: not available
Url : http://lists.xapian.org/pipermail/xapian-devel/attachments/20080731/e1df52e7/attachment-0001.bin 

