[Xapian-discuss] floating-point issues with set_sort_by_relevance_then_value? (1.2.3, BM25 k1=0)
Olly Betts
olly at survex.com
Mon Nov 1 10:58:49 GMT 2010
On Mon, Nov 01, 2010 at 01:59:42AM +0100, Marinos Yannikos wrote:
> This apparently prevents floating point precision issues in the last line
> of get_sumpart() [which calculates termweight * wdf_double * 1 /
> wdf_double].
Yes, for some values of wdf_double and termweight, this doesn't give
exactly termweight. We should do the division first, and then scale
termweight by the result.
I've reproduced this issue and I'm currently working on a fix.
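As a minimal sketch of the effect (in Python rather than Xapian's C++, and
not the actual get_sumpart() code), the two orderings can be compared
directly. The function names and the sample values 0.1 and 3.0 are
assumptions chosen for illustration; only the expression
termweight * wdf_double * 1 / wdf_double comes from the thread.

```python
# Hypothetical stand-ins for the values used in get_sumpart();
# this is an illustration, not Xapian's implementation.

def sumpart_left_to_right(termweight, wdf_double):
    # Evaluated as ((termweight * wdf_double) * 1) / wdf_double, so the
    # product is rounded before the division and the error can survive.
    return termweight * wdf_double * 1 / wdf_double


def sumpart_divide_first(termweight, wdf_double):
    # Do the division first, then scale termweight by the result.
    # x / x is exactly 1.0 for any finite, non-zero double x.
    factor = wdf_double * 1 / wdf_double
    return termweight * factor


termweight, wdf_double = 0.1, 3.0
print(sumpart_left_to_right(termweight, wdf_double) == termweight)  # False
print(sumpart_divide_first(termweight, wdf_double) == termweight)   # True
```

With these inputs the left-to-right form is off by one unit in the last
place, which is enough to change the order of two documents whose weights
should compare equal.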
> It also speeds up my case slightly. ;-)
How much is "slightly"? Or did you just mean it's doing less work,
rather than that there's a measurable speed-up?
> In order to prevent more such issues, it might be a good idea to round
> weights to a few fractional digits (10 should be enough) before using
> them as sort keys.
Rounding isn't a magic solution to such issues, and explicitly rounding
all the weights is extra work. I think it's better to focus on getting
the calculations right rather than trying to disguise any problems.
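One way to see why rounding only moves the problem, sketched in Python
(3.9 or later, for math.nextafter): two weights that differ by a single
unit in the last place can sit on either side of a rounding boundary and
so still produce different sort keys. The tie value below is an arbitrary
assumption picked for illustration, and this is only one possible reading
of the point above.

```python
import math
from decimal import Decimal

NDIGITS = 10
# Arbitrary decimal lying exactly halfway between two 10-digit values.
TIE = "0.12345678905"

# Nearest double to the tie; the tie itself is not exactly representable.
x = float(TIE)

# Pick the pair of adjacent doubles that straddles the tie point.
if Decimal(x) < Decimal(TIE):
    lo, hi = x, math.nextafter(x, math.inf)
else:
    lo, hi = math.nextafter(x, -math.inf), x

# lo and hi are neighbouring doubles, yet they round to different keys.
print(hi - lo)
print(round(lo, NDIGITS), round(hi, NDIGITS))
```

Any calculation error that lands two "equal" weights on opposite sides of
such a boundary would survive the rounding step.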
Cheers,
Olly