[Xapian-discuss] Xapian with djvu files?
Olly Betts
olly at survex.com
Sat Jan 19 02:06:40 GMT 2008
On Thu, Jan 17, 2008 at 03:22:12AM +0000, Olly Betts wrote:
> On Tue, Jan 15, 2008 at 10:39:49AM +1100, John Pye wrote:
> > You can use a free online OCR tool to generate DJVU files that include
> > text in them:
> >
> > http://any2djvu.djvuzone.org/
>
> But I don't have scans of documents containing non-ASCII characters, so
> this isn't going to help much.
>
> I'm happy to add support for djvu (or any other format with a suitable
> filter program), but I feel uneasy about doing so when I have little or
> no sample data to test with.

Aha - I've randomly discovered the magazines "gallery", which has some
more substantial examples, almost all with text layers, and most of
which contain some non-ASCII characters:

http://djvu.org/gallery/magazines.php

These seem to work, so I've added support for DjVu.
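
For anyone curious what a "suitable filter program" amounts to here, a
minimal sketch in Python follows. It is not the indexer's actual code. It
assumes DjVuLibre's djvutxt is installed and on the PATH, and that it writes
the text layer to standard output as UTF-8 when no output file is given.
The helper names build_filter_command and extract_text are made up for
illustration.

```python
import subprocess
import sys


def build_filter_command(path, program="djvutxt"):
    """Return the argv list for a text-extraction filter.

    Hypothetical helper: "djvutxt" is assumed to be DjVuLibre's tool,
    invoked as `djvutxt FILE` so that the text goes to stdout.
    """
    return [program, path]


def extract_text(argv):
    """Run a filter command and return its stdout decoded as UTF-8.

    check=True raises CalledProcessError if the filter exits non-zero,
    so a broken or missing text layer is not silently indexed as empty.
    """
    result = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
    # Replace undecodable bytes rather than failing on one bad page.
    return result.stdout.decode("utf-8", errors="replace")
```

Something like extract_text(build_filter_command("magazine.djvu")) would
then give the text to hand to the indexer, where "magazine.djvu" is just a
placeholder filename.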
Cheers,
Olly