help improving relevance of snippets displayed by Omega

Matthew Somerville matthew at mysociety.org
Mon Sep 21 09:28:10 BST 2020


Hi,

Ha, I was reading this thread thinking TheyWorkForYou (which I help
maintain) does highlight terms wherever they are, and then you mentioned it
:)
The code is open source; it is quite old and probably hair-raising, but as
you say it does basically do what you want.
Our Xapian database stores terms/boolean terms/values for the text, and as
the document data it stores only an identifier.
It works by doing the Xapian search, then looking up the resulting IDs in
the database, and then it boils down to calling
prepare_search_result_for_display on each result:
https://github.com/mysociety/theyworkforyou/blob/master/www/includes/easyparliament/hansardlist.php#L1279-L1306
That in turn uses two functions to work out the extract,
position_of_first_word:
https://github.com/mysociety/theyworkforyou/blob/master/www/includes/easyparliament/searchengine.php#L562
and highlight:
https://github.com/mysociety/theyworkforyou/blob/master/www/includes/easyparliament/searchengine.php#L475
They stem the entered words and loop through the text of each speech to
find the first match (where the extract starts) and the words to highlight.
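
In case a rough outline is more useful than reading old PHP, here is a
minimal Python sketch of the same idea: stem the query words, find the
first word in the text whose stem matches, cut a window around it, and
highlight every matching word. This is not the TheyWorkForYou code. All
the names here (toy_stem, first_match_offset, make_extract) and the window
sizes are made up for illustration, and toy_stem is only a crude stand-in;
in practice you would want the same stemmer you used at index time.

```python
import html
import re

WORD_RE = re.compile(r"[A-Za-z0-9']+")


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: lowercase, strip a few suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def query_stems(query):
    """Set of stems for the words the user entered."""
    return {toy_stem(w) for w in WORD_RE.findall(query)}


def first_match_offset(text, query):
    """Character offset of the first word whose stem matches the query, or None."""
    stems = query_stems(query)
    for m in WORD_RE.finditer(text):
        if toy_stem(m.group()) in stems:
            return m.start()
    return None


def highlight(text, query):
    """HTML-escape the text and wrap every matching word in <b>...</b>."""
    stems = query_stems(query)
    out, last = [], 0
    for m in WORD_RE.finditer(text):
        out.append(html.escape(text[last:m.start()]))
        word = html.escape(m.group())
        out.append(f"<b>{word}</b>" if toy_stem(m.group()) in stems else word)
        last = m.end()
    out.append(html.escape(text[last:]))
    return "".join(out)


def make_extract(text, query, before=30, length=120):
    """Cut a window starting shortly before the first match, then highlight it."""
    pos = first_match_offset(text, query)
    if pos is None:
        pos = 0  # no match: fall back to the start of the text
    start = max(0, pos - before)
    if start > 0:
        # Advance to the next word boundary so the extract does not start mid-word.
        space = text.find(" ", start)
        if space != -1 and space < pos:
            start = space + 1
    snippet = text[start:start + length]
    prefix = "..." if start > 0 else ""
    suffix = "..." if start + length < len(text) else ""
    return prefix + highlight(snippet, query) + suffix
```

You would call make_extract(speech_text, user_query) on each result after
fetching its text by ID. Note that the extract is worked out from the
stored text at query time, so nothing about it needs to be stored in the
Xapian database.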

ATB,
Matthew

On Sun, 20 Sep 2020 at 03:57, Michael Decerbo <michaeldecerbo at gmail.com>
wrote:

> In general, I'm wondering how best to use Xapian so that, at query time, my
> application can display an excerpt that is relevant to the query, not a
> sample chosen at indexing time without regard to the query that may or may
> not contain the query term(s). For example, TheyWorkForYou.com is listed on
> xapian.org as a site using Xapian, and when I enter a single-term query on
> that site the document excerpts provided as part of the search results
> invariably include highlighted words, possibly stemmed, responsive to the
> query. That's the effect I would like to achieve.
>
> If you can think of any sample code that I should refer to, or even if you
> could just suggest the broad outlines of a solution, I would be very
> grateful.
>

