[Xapian-discuss] quick-and-dirty web search for a bunch of PDFs?

John Pye john.pye at student.unsw.edu.au
Tue May 16 08:16:55 BST 2006


Hi all

I'm a newbie with Xapian. I have just a simple goal of creating a
quick-and-dirty web-based fulltext search for a bunch of PDF files that
I've collected from various conference CDs.

Is this a use-case that Omega covers, or do I need to use Xapian
directly? Where can I find some documentation about the capabilities of
Omega?

Assuming Omega doesn't do this, would it be reasonably straightforward
to attempt to write something using the python bindings? Has anyone done
a HOWTO for this pretty basic use-case? I presume I will need to provide
a PDF-to-text filter of some sort, e.g. poppler/xpdf or similar?
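
To make the question concrete, a rough sketch of the kind of script this
would involve might look like the following. It is only a sketch under
assumptions: it assumes the pdftotext command (from poppler or xpdf) is on
the PATH, and that the installed Xapian Python bindings provide
TermGenerator, DB_CREATE_OR_OPEN and commit(). The function names, the
English stemmer and the use of the file path as the unique ID are
illustrative choices, not an established recipe.

```python
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path):
    """Build the pdftotext command line; the final "-" sends text to stdout."""
    return ["pdftotext", "-enc", "UTF-8", str(pdf_path), "-"]


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(
        pdftotext_command(pdf_path), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")


def index_pdfs(pdf_dir, db_path):
    """Index every PDF found under pdf_dir into a Xapian database at db_path."""
    # Imported here so the helpers above can be used without the bindings.
    import xapian

    db = xapian.WritableDatabase(str(db_path), xapian.DB_CREATE_OR_OPEN)
    indexer = xapian.TermGenerator()
    indexer.set_stemmer(xapian.Stem("en"))  # assumption: English-language papers

    for pdf in sorted(Path(pdf_dir).rglob("*.pdf")):
        doc = xapian.Document()
        doc.set_data(str(pdf))  # stored so a search front end can show the path
        indexer.set_document(doc)
        indexer.index_text(extract_text(pdf))

        # Unique ID term, so re-indexing a file replaces its old entry.
        # Very long paths would need shortening: Xapian limits term length.
        id_term = "Q" + str(pdf)
        doc.add_boolean_term(id_term)
        db.replace_document(id_term, doc)

    db.commit()
```

Something like index_pdfs("conference-cds", "pdf-index.db") would then build
the index (both names are made up for illustration); the web-facing search
side is not covered by this sketch.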

I'd really appreciate any suggestions you can give on getting started
here. Xapian looks really nice and I'm looking forward to using it.

Cheers
JP

-- 
John Pye
School of Mechanical and Manufacturing Engineering
The University of New South Wales
Sydney  NSW 2052  Australia
t +61 2 9385 5127
f +61 2 9663 1222
mailto:john.pye_AT_student_DOT_unsw.edu.au
http://pye.dyndns.org/



