[Xapian-discuss] Incorrect get_matches_estimated() of Xapian::Mset

Hightman(马明练) hightman at zuaa.zju.edu.cn
Thu Sep 20 12:56:52 BST 2007


Hello, As far as I know, get_matches_estimated() returns an estimate of the number of documents which match the query.

But I have found a large disparity between the returned value and the real number of matches. For example, the real number of matches is 58, but the returned value is 458, so when users click through to the later pages they get a blank page, and they often complain to me.

I found that it mainly happens when the query includes a boolean term which matches a large number of documents.

E.g.:
There are only two data-types across all documents: every document belongs to either data-type I or data-type II, and the number of documents with data-type I is much greater than the number with data-type II. Now I do a test. Querying by some keywords only, the number of matches returned by get_matches_estimated() is "500". When I add the boolean condition, the number returned is 400 for data-type I and 100 for data-type II, but the real numbers are just the reverse.

So my conclusion is that Xapian calculates the estimate from the percentage of all documents which contain the FILTER term .... :(   How can I fix this?
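
To make my guess concrete, here is a minimal sketch of the arithmetic I suspect, written in plain Python without Xapian. It is only an assumption about how the estimate behaves, not Xapian's actual code. The function name and the collection sizes (10000 documents, 8000 of them data-type I) are hypothetical and chosen only so that the result reproduces the 400 and 100 I see.

```python
def independence_estimate(keyword_matches, filter_termfreq, doc_count):
    """Scale the keyword matches by the fraction of all documents that
    contain the filter term (my assumed model, not Xapian's real code)."""
    return round(keyword_matches * filter_termfreq / doc_count)


# Hypothetical collection sizes; only the 500 comes from my test.
DOC_COUNT = 10_000       # assumed total number of documents
TYPE_I_DOCS = 8_000      # assumed number of data-type I documents
TYPE_II_DOCS = 2_000     # assumed number of data-type II documents
KEYWORD_MATCHES = 500    # estimate returned for the keywords alone

estimate_type_1 = independence_estimate(KEYWORD_MATCHES, TYPE_I_DOCS, DOC_COUNT)
estimate_type_2 = independence_estimate(KEYWORD_MATCHES, TYPE_II_DOCS, DOC_COUNT)

print(estimate_type_1, estimate_type_2)  # 400 100
```

If the estimate really works like this, it will be wrong whenever the keywords occur mostly in the smaller data-type, which is exactly my case.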




