[Xapian-discuss] [Xapian-devel] Dealing with image PDF's
Reini Urban
rurban at x-ray.at
Thu Jul 31 12:40:13 BST 2008
2008/7/31 Richard Boulton <richard at lemurconsulting.com>:
> Reini Urban wrote:
>>
>> 2008/7/30 Frank Bruzzaniti <frank.bruzzaniti at gmail.com>:
>>>
>>> // Inspired by http://mjr.towers.org.uk/comp/sxw2text
>>> string safefile = shell_protect(file);
>>> string cmd = "tifftopnm " + safefile + " | gocr -f UTF8 -";
>>> try {
>>> dump = stdout_to_string(cmd);
>>> } catch (ReadError) {
>>> cout << "\"" << cmd << "\" failed - skipping\n";
>>> return;
>>> }
>>
>> Can we finally please use configure checks for such weird helper apps,
>> to avoid runtime exceptions where the system clearly has no such app.
>>
>> I once provided a huge patch to do that.
>> http://thread.gmane.org/gmane.comp.search.xapian.devel/783/
>
> Perhaps the patch should go in a ticket; that way, we're less likely to
> forget about it.
Ticket? Uh, my fault. I never thought about that. Sounds useful :)
http://trac.xapian.org/ticket/285
It should probably be split into multiple tickets and patches.
>> The patch, applied to 1.0.5, is attached. But there's much more in it,
>> so some parts may be stripped. See ChangeLog.
>> It includes TEXTCAT support for language and charset detection and cached
>> virtual directories (zip,msg,pst,...), to name a few. It has worked fine
>> for me for two years and I haven't touched it since 0.9.6.
>
> Sounds useful. However, I'm not sure that configure time is the right place
> to check for the existence of helper apps. In particular, quite often
> omindex is installed from a pre-compiled package (for example, in Debian),
> and the helper apps present at configure time need therefore bear no
> relation to those present at runtime.
>
> Perhaps omindex could be improved to handle missing helper applications -
> I've not actually looked at how it handles this recently, so I don't know if
> there's actually a problem, but if there is, the correct fix seems to me to
> be to handle missing helper applications gracefully, rather than disable
> them at configure time. Perhaps omindex would keep a cache, during each
> run, of the helper applications which have been found to be missing, so it
> would only attempt to run each one once.
I solved the preconfigured binary package problem with packaging dependencies.
A cache would be overkill.
Another advantage of such a config setting would be to hardcode the
actual helper location and not search the whole PATH at runtime for it.
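
For comparison, the per-run cache Richard describes could look roughly like
the sketch below. It is only an illustration, not code from omindex or from
the patch: HelperCache, found_on_path and is_executable_file are hypothetical
names, and the PATH lookup assumes a POSIX system (stat() and access()). Each
helper is probed at most once per run, so a missing gocr or tifftopnm costs
one lookup rather than one failed command per file.

```cpp
#include <cstdlib>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <sys/stat.h>  // stat(), S_ISREG (POSIX)
#include <unistd.h>    // access(), X_OK (POSIX)

// Hypothetical helper: true if `path` is a regular file we may execute.
inline bool is_executable_file(const std::string& path) {
    struct stat st;
    return stat(path.c_str(), &st) == 0 && S_ISREG(st.st_mode) &&
           access(path.c_str(), X_OK) == 0;
}

// Hypothetical helper: true if `name` resolves to an executable file,
// either directly (when it contains '/') or via the directories in $PATH.
inline bool found_on_path(const std::string& name) {
    if (name.empty()) return false;
    if (name.find('/') != std::string::npos) return is_executable_file(name);
    const char* path = std::getenv("PATH");
    if (!path) return false;
    std::istringstream dirs(path);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (dir.empty()) dir = ".";  // an empty PATH entry means the cwd
        if (is_executable_file(dir + "/" + name)) return true;
    }
    return false;
}

// Remembers, for the lifetime of one indexing run, whether each helper
// application was found, so each helper is probed only once.
class HelperCache {
  public:
    using Probe = std::function<bool(const std::string&)>;

    // The probe is injectable so the lookup strategy can be swapped,
    // e.g. for a hardcoded location chosen at configure time.
    explicit HelperCache(Probe probe = found_on_path)
        : probe_(std::move(probe)) {}

    bool available(const std::string& helper) {
        auto it = known_.find(helper);
        if (it != known_.end()) return it->second;  // cached answer
        bool ok = probe_(helper);
        known_[helper] = ok;
        return ok;
    }

  private:
    Probe probe_;
    std::map<std::string, bool> known_;
};
```

A caller would check cache.available("gocr") before building the command
line and skip the file type if it returns false. Passing a probe that tests
a single hardcoded path gives the configure-time behaviour instead.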
--
Reini Urban
http://phpwiki.org/ http://murbreak.at/