[Xapian-devel] Opensource Websearch Engine Project
charlie at juggler.net
Wed Oct 27 09:15:24 BST 2010
On 26/10/2010 17:55, Pierre-Louis Dehapiot wrote:
> I'm Pierre-Louis Dehapiot from Paris, France. I am studying computing programming at the ECE (a french school) and this year, the topic of my project is "google and indexing".
> To summarize, it deals with creating my own google in only one year :p !
> I saw that you made yourself an opensource websearch engine written in C (Xapian).
> I already made the php/CSS interface for my own project only in French for the moment but in English soon ! (you can have a look here : http://pti.pl4tipus.com)
> As you can see, it's very "google-like" : this is what the topic deals with.
> If you have few minutes to answer me, I think I need some tips about "how to make an indexing engine".
> I know how it works approximately but i need more details about the difficulties of the project. All the tips you can give me can be very useful.
> Can you help me ?
> I am glad of your future support.
> Pierre-Louis Dehapiot
(Apologies, I posted this to xapian-discuss by mistake)
You may be interested to know that Xapian was originally created to
power a web search engine (half a billion web pages or thereabouts).
You've got a pretty steep learning curve to be honest: you're first
going to need to learn about web crawling (note that Xapian does not
include a web crawler, although there are plenty of open source ones out
there - Heritrix is a good example), and how to keep your index clean
and current. Indexing webpages into Xapian isn't that hard - Xapian's
Omega application will do that for you if you don't want to control
Xapian directly. You can then use Xapian's PHP bindings to hook up to
your existing front end. Depending on how many pages you want to index,
you may also have to learn how to spread your index across multiple
machines.
I wish you luck with your project - I would start by reading about how
to build and use web crawlers, then try creating a small searchable
index using Xapian. I'm sure others on this list will help with any
questions, but you should do some research first.
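
To give a feel for the crawling side before you pick a real crawler, here
is a minimal sketch of a breadth-first crawl loop using only the Python
standard library. It is not Heritrix and not part of Xapian. The names
`LinkExtractor`, `crawl`, `fetch` and `FAKE_WEB` are made up for
illustration, and the "web" is a three-page dictionary held in memory so
the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from seed; fetch(url) returns HTML or None."""
    seen = {seed}            # URLs already queued, to avoid loops
    queue = deque([seed])    # the crawl frontier
    pages = {}               # url -> raw HTML, ready to hand to an indexer
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:     # page missing or fetch failed: skip it
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical three-page site held in memory (an assumption for the demo).
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b#top">B</a>',
    "http://example.test/a": '<a href="/">home</a>',
    "http://example.test/b": "<p>no links here</p>",
}

pages = crawl("http://example.test/", FAKE_WEB.get)
print(sorted(pages))
```

A real crawler also has to respect robots.txt, rate-limit requests per
host, cope with malformed HTML and revisit pages to keep the index
current, which is where most of the difficulty lies. The `pages`
dictionary is the point at which you would hand each document to an
indexer such as Omega or your own Xapian code.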