[Xapian-discuss] Xapian with djvu files?

Olly Betts olly at survex.com
Sat Jan 19 02:06:40 GMT 2008


On Thu, Jan 17, 2008 at 03:22:12AM +0000, Olly Betts wrote:
> On Tue, Jan 15, 2008 at 10:39:49AM +1100, John Pye wrote:
> > You can use a free online OCR tool to generate DJVU files that include
> > text in them:
> > 
> > http://any2djvu.djvuzone.org/
> 
> But I don't have scans of documents containing non-ASCII characters, so
> this isn't going to help much.
> 
> I'm happy to add support for djvu (or any other format with a suitable
> filter program), but I feel uneasy about doing so when I have little or
> no sample data to test with.

Aha - I've randomly discovered the magazines "gallery", which has some
more substantial examples, almost all with text layers, and most of
which contain some non-ASCII characters:

http://djvu.org/gallery/magazines.php

These seem to work, so I've added support for DjVu.

Cheers,
    Olly


