[Xapian-tickets] [Xapian] #400: Optimise AND_MAYBE when the RHS has a maxweight of 0
Xapian
nobody at xapian.org
Tue Dec 3 06:47:28 GMT 2019
#400: Optimise AND_MAYBE when the RHS has a maxweight of 0
-----------------------------+-------------------------------
Reporter: Richard Boulton | Owner: Olly Betts
Type: enhancement | Status: assigned
Priority: normal | Milestone: 1.5.0
Component: Matcher | Version: git master
Severity: minor | Resolution:
Keywords: | Blocked By:
Blocking: | Operating System: All
-----------------------------+-------------------------------
Changes (by Olly Betts):
* status: new => assigned
* version: SVN trunk => git master
* milestone: 1.4.x => 1.5.0
Comment:
The case where the scale factor is zero (which is what the testcases in
the patch test) has been handled since 1.4.10
(9e1023ab5d28532e649715754d5f000038e98f2f) - I tested with the new
testcases applied to git master and they pass. This optimisation
doesn't cause the problem with percentages highlighted above since we
don't count subqueries for which factor == 0.
We don't currently handle the case where the maxweight of the RHS is zero
or becomes zero, but I think that's quite easy to do:
{{{#!diff
diff --git a/xapian-core/api/queryinternal.cc b/xapian-core/api/queryinternal.cc
index c5148ca350e0..4c888d8b0af7 100644
--- a/xapian-core/api/queryinternal.cc
+++ b/xapian-core/api/queryinternal.cc
@@ -2357,10 +2357,18 @@ QueryAndMaybe::postlist(QueryOptimiser * qopt, double factor) const
}
OrContext ctx(qopt, subqueries.size() - 1);
do_or_like(ctx, qopt, factor, 0, 1);
+ Xapian::termcount save_total_subqs = qopt->get_total_subqs();
unique_ptr<PostList> r(ctx.postlist());
if (!r.get()) {
RETURN(l.release());
}
+ if (r->recalc_maxweight() == 0.0) {
+ // The RHS can't contribute any weight, so can be discarded. Reset
+ // total_subqs in case we counted any in the RHS so that percentages
+ // don't get messed up.
+ qopt->set_total_subqs(save_total_subqs);
+ RETURN(l.release());
+ }
RETURN(new AndMaybePostList(l.release(), r.release(),
qopt->matcher, qopt->db_size));
}
}}}
I'm not sure if we actually need to restore total_subqs - it seems there
probably can't be any weighted terms in the RHS if its overall maxweight
is zero, but maybe with a custom weighting scheme there could be.
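To illustrate the idea behind the patch outside of Xapian, here is a minimal self-contained sketch. Everything in it (`ToyPostList`, `ToyOptimiser`, `build_rhs`, `build_and_maybe`) is a hypothetical stand-in invented for this example and not Xapian's API; it only models "discard an RHS whose maxweight is zero and restore the subquery count", and doesn't try to mirror where Xapian actually counts subqueries.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a posting list: only its weight bound matters here.
struct ToyPostList {
    double maxweight;  // upper bound on the weight this list can contribute
    bool and_maybe;    // true if this node combines an LHS with an optional RHS
    ToyPostList(double w, bool am) : maxweight(w), and_maybe(am) {}
    double recalc_maxweight() const { return maxweight; }
};

// Hypothetical stand-in for the optimiser state that counts weighted subqueries.
struct ToyOptimiser {
    unsigned total_subqs = 0;
};

// Assumption: building the RHS bumps the subquery counter as a side effect.
std::unique_ptr<ToyPostList>
build_rhs(ToyOptimiser& opt, double maxweight, unsigned subqs)
{
    opt.total_subqs += subqs;
    return std::make_unique<ToyPostList>(maxweight, false);
}

// Build "lhs AND_MAYBE rhs", dropping the RHS if it can't contribute weight.
std::unique_ptr<ToyPostList>
build_and_maybe(ToyOptimiser& opt, std::unique_ptr<ToyPostList> lhs,
                double rhs_maxweight, unsigned rhs_subqs)
{
    assert(lhs);
    // Remember the count so it can be restored if the RHS is discarded.
    unsigned saved_total_subqs = opt.total_subqs;
    std::unique_ptr<ToyPostList> rhs = build_rhs(opt, rhs_maxweight, rhs_subqs);
    if (rhs->recalc_maxweight() == 0.0) {
        // The RHS can only ever add zero weight, so the LHS alone is equivalent.
        opt.total_subqs = saved_total_subqs;
        return lhs;
    }
    // Otherwise keep both sides; the combined bound is the sum of the two.
    return std::make_unique<ToyPostList>(lhs->maxweight + rhs->maxweight, true);
}
```

With a zero-maxweight RHS, `build_and_maybe` hands back the LHS unchanged and leaves `total_subqs` as it was before the RHS was built; with a non-zero RHS it returns a combined node and the RHS subqueries stay counted.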
Let's try to get test coverage for the above and apply for 1.5.0. I don't
think this additional case is worth patching 1.4.x for.
And I think we don't need to worry about the case where the max weight
starts off non-zero but becomes zero during the match - in such a situation
the more obvious implementation is for the PostingSource to simply signal
that it has reached its end rather than just setting its maxweight to zero,
and that is already handled efficiently.
--
Ticket URL: <https://trac.xapian.org/ticket/400#comment:4>
Xapian <https://xapian.org/>