[Xapian-discuss] [Omindex] How to associate a web URL in search results based on a document stored as a local file?

Olly Betts olly at survex.com
Tue May 16 09:46:58 BST 2006


On Mon, May 15, 2006 at 03:43:21PM -0800, oscaruser at programmer.net wrote:
> I download a web page and want to add it to the index. I am using
> omindex (as below). When I search for the document, I see in search
> results that the hyper text link URL is to a file (e.g.
> http://www.mysite.com/shoe/tennis_shoe/tennis_shoe.html).

That's what you asked for when you said:

 --url 'http://www.mysite.com/shoe/tennis_shoe'

> What I want to be able to do is download the HTML file, save it, have
> it appear with a link back to the original web URL.

You mean like Google's "cached copy"?

After indexing, save each page under a filename which can be derived
from the URL (e.g. the MD5 sum of the URL).  Then you can write a simple
template page in PHP or similar (called cached.php, say) which takes a
parameter "url", loads the text of the cached page, and displays it
with a link back to that URL.
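
As a rough illustration of that idea, here is a minimal sketch in Python
rather than PHP.  The cache directory name, the ".html" suffix, and the
function names are all assumptions made up for this example; nothing here
is part of omindex or Omega.

```python
import hashlib
import html
from pathlib import Path

# Hypothetical location for the saved pages; use whatever suits your setup.
CACHE_DIR = Path("cache")


def cache_path(url, cache_dir=CACHE_DIR):
    """Derive the cache filename from the URL via its MD5 hex digest."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return cache_dir / (digest + ".html")


def save_page(url, page_text, cache_dir=CACHE_DIR):
    """Store the downloaded page text under the name derived from its URL."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_path(url, cache_dir)
    path.write_text(page_text, encoding="utf-8")
    return path


def render_cached(url, cache_dir=CACHE_DIR):
    """Return the cached page text preceded by a link to the original URL."""
    body = cache_path(url, cache_dir).read_text(encoding="utf-8")
    link = html.escape(url, quote=True)  # make the URL safe inside HTML
    banner = '<p>Cached copy of <a href="%s">%s</a></p>\n' % (link, link)
    return banner + body
```

Because the filename depends only on the URL, the script serving the
cached copy needs no lookup table: given the "url" parameter it can
recompute the same digest and open the file directly.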

Then you can just set the link in the omegascript query template to be
something like:

<a href="cached.php?url=$url{$field{url}}">$html{$field{title}}</a>
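
One thing to watch with a link of this shape is that the original URL
travels as a query parameter, so characters such as "?" and "&" in it
need percent-encoding or the receiving script will see a truncated
value.  A small Python sketch of the round trip follows; the function
name and the sample query string are invented for illustration.

```python
from urllib.parse import parse_qs, quote, urlsplit


def cached_link(url, script="cached.php"):
    """Build the href for the cached-copy page, percent-encoding the URL."""
    return script + "?url=" + quote(url, safe="")


# Hypothetical page URL with its own query string.
page = "http://www.mysite.com/shoe/tennis_shoe/tennis_shoe.html?size=9&colour=red"
href = cached_link(page)

# What the receiving script would recover from its "url" parameter.
recovered = parse_qs(urlsplit(href).query)["url"][0]
```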

Cheers,
    Olly


