[Xapian-tickets] [Xapian] #363: Avoid using the termlist when calculating percentages in get_mset()
Xapian
nobody at xapian.org
Wed May 6 02:52:34 BST 2009
#363: Avoid using the termlist when calculating percentages in get_mset()
---------------------+------------------------------------------------------
Reporter: richard | Owner: richard
Type: defect | Status: assigned
Priority: normal | Milestone: 1.1.1
Component: Matcher | Version: SVN trunk
Severity: normal | Keywords:
Blockedby: | Platform: All
Blocking: 181 |
---------------------+------------------------------------------------------
Comment(by olly):
Feel free to try it, but I think that's going to be measurably slower.
On the plus side, note that this can potentially be optimised for AND
queries with all leaf children - e.g. a MultiAndPostList of terms knows for
certain that all its sub-postlists will match, so there's no need to recurse
into them - we can store a precomputed mask to {{{|=}}} in.
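A rough sketch of that AND optimisation, to make the idea concrete. None of these classes are Xapian's real postlist classes: {{{Node}}}, {{{LeafNode}}}, {{{OrNode}}}, {{{AndOfLeavesNode}}} and {{{gather()}}} are made-up names, the {{{matches}}} flag stands in for "this postlist is positioned on the current document", and a single 64-bit word is assumed to be enough to hold one bit per query term.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// One bit per query term (assumption: at most 64 terms in the query).
using Mask = std::uint64_t;

// Hypothetical stand-in for a node in the postlist tree.
struct Node {
    virtual ~Node() {}
    // OR into `mask` the bits of the terms matching the current document.
    virtual void gather(Mask& mask) const = 0;
};

// A single term.
struct LeafNode : Node {
    unsigned term_index;
    bool matches;  // stand-in for "positioned on the current document"
    LeafNode(unsigned i, bool m) : term_index(i), matches(m) {
        assert(i < 64);
    }
    void gather(Mask& mask) const override {
        if (matches) mask |= Mask(1) << term_index;
    }
};

// General case: has to recurse into every child (one virtual call each).
struct OrNode : Node {
    std::vector<const Node*> kids;  // non-owning, to keep the sketch short
    void gather(Mask& mask) const override {
        for (const Node* k : kids) k->gather(mask);
    }
};

// AND of terms only: if the AND matches then every child matches, so the
// mask can be built once up front and ORed in without any recursion.
struct AndOfLeavesNode : Node {
    Mask all_terms = 0;
    AndOfLeavesNode(std::initializer_list<unsigned> term_indexes) {
        for (unsigned i : term_indexes) {
            assert(i < 64);
            all_terms |= Mask(1) << i;
        }
    }
    void gather(Mask& mask) const override { mask |= all_terms; }
};
```

For an OR of two terms and an AND of two more, {{{gather()}}} on the root makes one virtual call per OR child, but the AND contributes both of its bits with a single OR.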
I think we would probably want a specialised class rather than
{{{vector<bool>}}} as we don't need dynamic resizing (just a runtime-
specified fixed size, which rules out {{{bitset<>}}} which has a compile-
time constant fixed size) but we do want the ability to efficiently do:
{{{v1 |= v2;}}}
But {{{vector<bool>}}} should be fine to prototype what sort of overhead
there is here.
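For illustration, a minimal sketch of what such a specialised class might look like. The name {{{TermMask}}} and its interface are invented here, not anything in Xapian; it assumes one bit per query term, with the size fixed at construction, and does {{{v1 |= v2;}}} a 64-bit word at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical class (not part of Xapian): a bitmask whose size is chosen
// at runtime when it is constructed and never changes afterwards.
class TermMask {
    std::size_t nbits;
    std::vector<std::uint64_t> words;  // sized once in the constructor

  public:
    explicit TermMask(std::size_t nbits_)
        : nbits(nbits_), words((nbits_ + 63) / 64, 0) {}

    std::size_t size() const { return nbits; }

    void set(std::size_t i) {
        assert(i < nbits);
        words[i / 64] |= std::uint64_t(1) << (i % 64);
    }

    bool test(std::size_t i) const {
        assert(i < nbits);
        return (words[i / 64] >> (i % 64)) & 1;
    }

    // OR a whole word at a time; both masks must have the same size.
    TermMask& operator|=(const TermMask& o) {
        assert(nbits == o.nbits);
        for (std::size_t w = 0; w < words.size(); ++w) words[w] |= o.words[w];
        return *this;
    }

    // Number of set bits, e.g. how many query terms matched.
    std::size_t count() const {
        std::size_t n = 0;
        for (std::uint64_t w : words) {
            for (; w; w &= w - 1) ++n;  // clears the lowest set bit each time
        }
        return n;
    }
};
```

With this, a query of up to 64 terms needs a single word per mask, so {{{v1 |= v2;}}} is one OR instruction plus the loop overhead.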
Perhaps better, be lazy about this and only do it (via a new method) for a
match which is the current best. This has a bad worst case though - if the
weights increase with docid it's even more work than the non-lazy approach
as there are a lot of extra virtual method calls.
This could be optional on a "want percentages" flag, but it would be nice
to use this for "matching terms" too, which would require calculating and
keeping these for anything in the proto-MSet...
I don't think we want to rush into this, but we clearly can't ship 1.1.1
with the testsuite failing assertions. So I think it's better to find a
quick fix for the synonym4 assertion failure, perhaps along the lines of
the (non-working) patch I tried.
--
Ticket URL: <http://trac.xapian.org/ticket/363#comment:3>
Xapian <http://xapian.org/>