[Xapian-discuss] quick-and-dirty web search for a bunch of PDFs?
John Pye
john.pye at student.unsw.edu.au
Tue May 16 08:16:55 BST 2006
Hi all
I'm a newbie with Xapian. I have just a simple goal of creating a
quick-and-dirty web-based fulltext search for a bunch of PDF files that
I've collected from various conference CDs.
Is this a use-case that Omega covers, or do I need to use Xapian
directly? Where can I find some documentation about the capabilities of
Omega?
Assuming Omega doesn't do this, would it be reasonably straightforward
to attempt to write something using the python bindings? Has anyone done
a HOWTO for this pretty basic use-case? I presume I will need to provide
a PDF-to-text filter of some sort, e.g. poppler/xpdf or similar?
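A rough sketch of what the Python route might look like, assuming the `pdftotext` command-line tool (shipped with poppler and xpdf) is on the PATH and the Xapian Python bindings are installed. The helper names (`pdftotext_command`, `extract_text`, `find_pdfs`, `index_pdfs`) and the choice of storing the file path as the document data are illustrative assumptions, not anything prescribed by Xapian or Omega.

```python
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path):
    """Build the pdftotext command line; the trailing "-" sends text to stdout."""
    return ["pdftotext", "-enc", "UTF-8", str(pdf_path), "-"]


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(
        pdftotext_command(pdf_path), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")


def find_pdfs(root):
    """Return every PDF file below root, sorted for a stable order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def index_pdfs(db_path, root):
    """Index the text of each PDF under root into a Xapian database."""
    # Imported here so the helpers above can be used without the bindings.
    import xapian

    db = xapian.WritableDatabase(str(db_path), xapian.DB_CREATE_OR_OPEN)
    indexer = xapian.TermGenerator()
    indexer.set_stemmer(xapian.Stem("english"))

    for pdf in find_pdfs(root):
        doc = xapian.Document()
        # Store the path so a search front end can link back to the file.
        doc.set_data(str(pdf))
        indexer.set_document(doc)
        indexer.index_text(extract_text(pdf))
        db.add_document(doc)

    db.commit()
```

Calling `index_pdfs("pdfindex", "/path/to/conference-cds")` (both paths are placeholders) would build the database. Running it a second time over the same files would add duplicate documents, so a real indexer would need some way of recognising files it has already seen.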
I'd really appreciate any suggestions you can give on getting started
here. Xapian looks really nice and I'm looking forward to using it.
Cheers
JP
--
John Pye
School of Mechanical and Manufacturing Engineering
The University of New South Wales
Sydney NSW 2052 Australia
t +61 2 9385 5127
f +61 2 9663 1222
mailto:john.pye_AT_student_DOT_unsw.edu.au
http://pye.dyndns.org/