[Xapian-tickets] [Xapian] #682: Collapsing when sorting by value unnecessarily slow
Xapian
nobody at xapian.org
Wed Jun 13 02:45:57 BST 2018
#682: Collapsing when sorting by value unnecessarily slow
---------------------+---------------------------
Reporter: olly | Owner: olly
Type: defect | Status: closed
Priority: normal | Milestone: 1.5.0
Component: Matcher | Version: 1.2.20
Severity: normal | Resolution: fixed
Keywords: | Blocked By:
Blocking: | Operating System: All
---------------------+---------------------------
Changes (by olly):
* status: assigned => closed
* resolution: => fixed
Comment:
That custom implementation is now in `common/heap.h`.
I did contact Gareth successfully shortly after the last comment and he
was going to try to test with git master, but he's not got back to us
since.
I had a quick look at the code on the RELEASE/1.4 branch. Two things
happen there each time we find a better candidate for a collapse key:
* We linearly scan through the array the heap is in until we find the
previous candidate (or don't find it if it has been eliminated from the
proto-MSet already). On git master there's another layer of indirection
here which allows us to do this in O(1).
* We flag the heap and in need of rebuilding next time we need to use it
as a heap, and a call to `make_heap` needs most 3*N comparisons, and worst
case is that we always have to rebuild. With the custom heap
implementation we can use `Heap::siftdown()` to replace the element while
keeping the heap valid in at most 2*log(N) comparisons.
Using the custom heap implementation looks fairly straightforward (I
refitted it throughout git master recently and that wasn't hard) but that
still leaves us with a worst case of O(m*n) where m is the requested MSet
size and n the number of documents in the database.
I think trying to retrofit the additional layer of indirection into the
RELEASE/1.4 branch would be too disruptive, especially as we're a long way
into the 1.4 stable release series.
Changing the heap implementation would probably help, but it carries some
risk for all users while fixing a problem which only users using
collapsing, and not all those users even. And it is unlikely to deliver
the substantial change which git master should.
So I think we don't try to backport anything for this issue.
--
Ticket URL: <https://trac.xapian.org/ticket/682#comment:5>
Xapian <https://xapian.org/>
Xapian
More information about the Xapian-tickets
mailing list