About search result excerpts with HTML tags showing
Olly Betts
olly at survex.com
Tue Apr 18 21:33:03 BST 2017
On Tue, Apr 18, 2017 at 12:00:21PM -0400, sorabji at sorabji.com wrote:
> Hi, folks. New to Xapian. I just built a couple of indexes. Search results
> seem good but I can't figure out why the excerpts are showing HTML tags.
> These tags are not present in the original HTML documents. Is there a
> built-in way to either get rid of these tags or have them render as actual
> HTML tags?
There's a bug in the query template - it currently does:
$highlight{$snippet{$field{sample}},$terms}
$highlight{TEXT,TERMS} escapes for HTML and highlights TERMS.
$snippet{TEXT} selects a dynamic snippet, escapes for HTML and
highlights query terms in the text.
So we really don't want to do both - replace this with either:
$snippet{$field{sample}}
or:
$highlight{$field{sample},$terms}
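To see why doing both leaves literal tags in the output, here is a rough Python analogy. It is not Omega's implementation: the function names, the <strong> and <b> markup, and the word-matching logic are all assumptions made for illustration. The only point is that escaping already-escaped markup turns the snippet's tags into visible text.

```python
import html
import re


def highlight(text, terms, open_tag="<b>", close_tag="</b>"):
    """Hypothetical stand-in for $highlight: escape for HTML, then mark terms."""
    escaped = html.escape(text)
    for term in terms:
        pattern = re.compile(r"\b%s\b" % re.escape(html.escape(term)), re.IGNORECASE)
        escaped = pattern.sub(lambda m: open_tag + m.group(0) + close_tag, escaped)
    return escaped


def snippet(text, terms):
    """Hypothetical stand-in for $snippet: it also escapes and highlights.

    Snippet selection is left out; only the escaping behaviour matters here.
    """
    return highlight(text, terms, "<strong>", "</strong>")


sample = "Xapian & Omega search"
terms = ["xapian"]

# Escaped once: the markup is real HTML and the "&" is escaped correctly.
once = snippet(sample, terms)

# Escaped twice: the tags added by snippet() are themselves escaped, so a
# browser shows them as text, and "&" becomes "&amp;amp;".
twice = highlight(snippet(sample, terms), terms)

print(once)
print(twice)
```

In this sketch the second pass sees the output of the first as plain text, which matches the symptom reported: tags that are not in the original documents show up in the excerpt.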
(The reason it's like that is that the original snippet generation didn't
do HTML escaping or highlighting, but that meant we had to parse the text
twice, so it was changed during the development series.)
Using $snippet{$field{sample}} is probably the better choice (and what
the default template ought to use I think) - if the stored sample is
small then the snippet generation will short-cut, and if you're storing
larger samples then you want to select a smaller snippet from them.
Thanks for reporting this - I'll get a fix in before 1.4.4.
Cheers,
Olly