About search result excerpts with HTML tags showing

Olly Betts olly at survex.com
Tue Apr 18 21:33:03 BST 2017


On Tue, Apr 18, 2017 at 12:00:21PM -0400, sorabji at sorabji.com wrote:
> Hi, folks. New to Xapian. I just built a couple of indexes. Search results
> seem good but I can't figure out why the excerpts are showing HTML tags.
> These tags are not present in the original HTML documents. Is there a
> built-in way to either get rid of these tags or have them render as actual
> HTML tags?

There's a bug in the default query template, which currently uses:

	$highlight{$snippet{$field{sample}},$terms}

$highlight{TEXT,TERMS} escapes for HTML and highlights TERMS.

$snippet{TEXT} selects a dynamic snippet, escapes for HTML and
highlights query terms in the text.
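
As a rough illustration of what happens when the output of one is fed
into the other, here's a small Python sketch. This is not Omega's actual
code: snippet() and highlight() are hypothetical stand-ins that just
escape and wrap terms, the real $snippet also selects a passage, and the
<b> markup is an assumption.

```python
import html
import re


def highlight(text, terms):
    """Hypothetical stand-in: escape TEXT for HTML, then wrap TERMS in <b>."""
    escaped = html.escape(text)
    for term in terms:
        pattern = r"\b(%s)\b" % re.escape(term)
        escaped = re.sub(pattern, r"<b>\1</b>", escaped, flags=re.IGNORECASE)
    return escaped


def snippet(text, terms):
    """Hypothetical stand-in: passage selection is omitted, so this only
    escapes and highlights, like highlight() above."""
    return highlight(text, terms)


sample = "Fish & chips"

# Escaped and highlighted once: safe to emit as HTML.
once = snippet(sample, ["fish"])

# Escaped a second time: the tags added by the first pass are now
# escaped too, so a browser shows them as literal text.
twice = highlight(once, ["fish"])

print(once)
print(twice)
```

The second value contains &lt;b&gt;, which a browser renders as a visible
<b> tag rather than as markup.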

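So we really don't want to do both - $highlight then escapes the tags
which $snippet has already added, which is why they show up as text.
Replace this with either: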

	$snippet{$field{sample}}

or:

	$highlight{$field{sample},$terms}

(The reason it's like that is that the original snippet generation didn't
do HTML escaping or highlighting, but that meant we had to parse the text
twice, so it was changed during the development series.)

Using $snippet{$field{sample}} is probably the better choice (and what
the default template ought to use I think) - if the stored sample is
small then the snippet generation will short-cut, and if you're storing
larger samples then you want to select a smaller snippet from them.

Thanks for reporting this - I'll get a fix in before 1.4.4.

Cheers,
    Olly


