[Xapian-discuss] Very far out and static get_matches_estimated
Matthew Somerville
matthew at mysociety.org
Thu Jun 11 00:29:06 BST 2009
Hi,
I'm getting quite odd results from get_matches_estimated() that I
haven't seen before; we've just added a bunch of new data to the
database. This is Xapian 1.0.7, with checkatleast set to 100.
The database gets new data added automatically at around 8.30am BST, so
the links below may show different numbers after that. The figures I
quote are what I see as I write.
http://www.theyworkforyou.com/search/?pop=1&s=statistics+19950101..19951231
currently returns 1-20 of 14,678; page 18 gives 341-360 of 14,678:
http://www.theyworkforyou.com/search/?pop=1&s=statistics+19950101..19951231&p=18
But then page 19 gives 361-362 of 362, which is correct:
http://www.theyworkforyou.com/search/?s=statistics+19950101..19951231&p=19
So the estimate is wildly out for all pages until we get to the actual
number of results. Changing the sort to relevance instead of reverse
date gives a different far out number, but the effect is the same.
Without the date range limiting, the initial estimate is 43,612, and
this slowly changes as I up the page count until it gets to the correct
result of 43,537 (good initial estimate!), as I'd expect.
It's also set by default to collapse per debate, but turning that off
doesn't make any difference: it initially gives "1-20 of 30,249", stays
at that figure up to "721-740 of 30,249", and then gives "741-746 of 746".
Any ideas?
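For what it's worth, the page at which the count snaps to the true figure
is consistent with simple arithmetic, assuming (this is an assumption about
get_mset(), not something confirmed above) that the matcher examines at
least max(first + maxitems, checkatleast) candidates and that the count is
exact once that bound reaches the real number of matches. The helper names
below are hypothetical and this is only a rough sketch; it says nothing
about why the earlier estimates are so far out.

```python
PAGE_SIZE = 20       # results per page on the search pages linked above
CHECKATLEAST = 100   # value quoted in this message


def page_first(page, page_size=PAGE_SIZE):
    """Zero-based offset of the first result on a 1-based page number."""
    return (page - 1) * page_size


def count_is_exact(first, maxitems, checkatleast, total_matches):
    """True if the matcher would have examined every matching document.

    ASSUMPTION: at least max(first + maxitems, checkatleast) candidates
    are examined, so the count is exact when that covers all matches.
    """
    examined = max(first + maxitems, checkatleast)
    return examined >= total_matches


# Date-restricted query: 362 real matches.
for page in (1, 18, 19):
    exact = count_is_exact(page_first(page), PAGE_SIZE, CHECKATLEAST, 362)
    print(f"362 matches, page {page}: exact={exact}")

# Same query with collapsing turned off: 746 real matches.
for page in (37, 38):
    exact = count_is_exact(page_first(page), PAGE_SIZE, CHECKATLEAST, 746)
    print(f"746 matches, page {page}: exact={exact}")
```

Under that assumption, pages 18 and 37 stop just short of the real totals
(360 < 362 and 740 < 746), so they still show an estimate, while pages 19
and 38 go past the end and show the true count.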
ATB,
Matthew