[Xapian-discuss] Installing Omega on Ubuntu
peter.masiar at yale.edu
Fri Mar 17 03:48:30 GMT 2006
Quoting Philip Neustrom <philipn at gmail.com>:
> Along the lines of python-xapian, if you want a simple starting point,
> check the example search/index for the bindings. You also might be
> interested in taking a look at some code I wrote that does search
> indexing/querying/result context in Python. It's a part of Sycamore,
> which is a wiki written in Python (not yet 'released', but you can
> check out the current code).
> http://daviswiki.org/repos/trunk/Sycamore/search.py is the simple
> search/index, and
> http://daviswiki.org/repos/trunk/Sycamore/wikiaction.py has a function
> do_search as well as print_context.
Sure, I will look into it. Thanks, Philip.
(I emailed you about daviswiki off-list, since I guess it is off-topic here on the Xapian list.)
> I thought about spinning this stuff off as a somewhat independent
> module for people who want to have simple Python-based
> indexing/searching - maybe you'd like to help on that front?
> I'm not entirely sure what Omega does as I've only used it briefly.
> Does it crawl? If not, you'd have to write a crawler. What are your
> needs? Is the site dynamic?
My needs are pretty basic, and many people are in the same position.
There are a couple dozen websites. Some are public and open; some are free
but require a login/cookie. For every site, there is a URL (page or query)
that serves as the starting point for the web crawler. The program needs to
read the sites, index them, and provide a web frontend for searching. The
list of websites is pretty static and could live in a config file.
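
A minimal sketch of what such a config file could look like, and how it might be read in Python. The section names, the keys start_url and cookies, and the example.org URLs are all made up for illustration; this is not a format that Omega or Xapian defines.

```python
import configparser

# Hypothetical config format: one section per site. The section names,
# keys, and URLs are illustrative assumptions, not an Omega format.
SAMPLE_CONFIG = """
[public-site]
start_url = http://www.example.org/
cookies =

[members-site]
start_url = http://members.example.org/search?q=all
cookies = /etc/crawler/members-cookies.txt
"""


def load_sites(text):
    """Parse the config text into a list of site dictionaries."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    sites = []
    for name in parser.sections():
        section = parser[name]
        sites.append({
            "name": name,
            "start_url": section["start_url"],
            # An empty or missing cookies value means the site is open.
            "cookies": section.get("cookies", "").strip() or None,
        })
    return sites


if __name__ == "__main__":
    for site in load_sites(SAMPLE_CONFIG):
        print(site["name"], site["start_url"], site["cookies"])
```

Keeping one section per site means a new URL can be added by appending a few lines, without touching the crawler itself.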
It would be nice to have the possibility of adding a URL and having it
crawled, indexed, and ready to be searched within 10 minutes, but that is
not required.
Basically, a small private Google. :-)
For the crawler, I was advised to use wget. On the Xapian wiki, I noticed
a utility called dig or htdig; I will need to research it and compare.
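
As a rough sketch of the wget approach, the helper below only builds the command line for one site; it does not run anything. The function name, the depth of 3, the one-second wait, and the paths are my assumptions. The wget options themselves (--recursive, --level, --no-parent, --wait, --directory-prefix, --load-cookies) are standard ones.

```python
def build_wget_command(start_url, out_dir, cookies=None, depth=3):
    """Return a wget argument list that mirrors one site into out_dir.

    The depth and wait values are illustrative defaults, not recommendations.
    """
    cmd = [
        "wget",
        "--recursive",                      # follow links from the start URL
        f"--level={depth}",                 # limit recursion depth
        "--no-parent",                      # do not climb above the start URL
        "--wait=1",                         # pause between requests
        f"--directory-prefix={out_dir}",    # where downloaded files are saved
    ]
    if cookies:
        # Sites that require a login can reuse a saved cookies file.
        cmd.append(f"--load-cookies={cookies}")
    cmd.append(start_url)
    return cmd


if __name__ == "__main__":
    # Printing instead of executing; subprocess.run(cmd, check=True)
    # would actually start the crawl.
    print(" ".join(build_wget_command("http://www.example.org/", "/tmp/crawl")))
```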
I understand that my crawler needs to be custom: it is my own way of
preparing the files to be indexed.
I was looking for a standard indexer backend with a web search interface,
and Omega fits that role perfectly.
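
To illustrate how crawled files might be handed to Omega, here is a small sketch that builds an omindex command line. It assumes Omega's omindex indexer is installed and takes --db and --url options followed by the directory to index; the function name and all paths are hypothetical, and how the URL prefix should map onto a wget download directory is left as an assumption.

```python
def build_omindex_command(db_path, url_prefix, doc_dir):
    """Return an omindex argument list for indexing doc_dir into db_path.

    url_prefix is the URL that the files in doc_dir correspond to, so that
    search results can link back to the original pages.
    """
    return ["omindex", "--db", db_path, "--url", url_prefix, doc_dir]


if __name__ == "__main__":
    # Hypothetical paths; subprocess.run(cmd, check=True) would run the indexer.
    cmd = build_omindex_command(
        "/var/lib/omega/data/default",
        "http://www.example.org/",
        "/tmp/crawl/www.example.org",
    )
    print(" ".join(cmd))
```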