[Xapian-tickets] [Xapian] #671: Performance issues when querying over large number of local databases (shards)
Xapian
nobody at xapian.org
Tue Mar 24 02:37:09 GMT 2015
#671: Performance issues when querying over large number of local databases
(shards)
----------------------------------+--------------------------
Reporter: wgreenberg | Owner: olly
Type: defect | Status: new
Priority: normal | Milestone:
Component: Other | Version:
Severity: normal | Resolution:
Keywords: sharding performance | Blocked By:
Blocking: | Operating System: Linux
----------------------------------+--------------------------
Comment (by olly):
I don't think this patch is doing what you think it is.
Each table has a built-in cursor (`C`), which you're using here. It's
used for operations which are implemented using a cursor, but for which
the cursor doesn't need to live on after we return to the caller -
this mostly just avoids having to create a temporary cursor for every such
operation, but also has the benefit that the blocks needed may already
have been loaded by a previous operation.
The problem with what you're doing is that you just use whatever is in `C`
already. For the root block, that's fine, but once `j < level` we're
searching for a key in whatever block of that level happens to be in the
cursor. Most of the time that won't be the right block, so we'll end up
on the first or last entry of the branch block, depending on which side of
the right path down the tree we are. So (unless something else happens to
be making sure that `C` points to the right place) you're pre-reading an
essentially arbitrary set of blocks here for the most part.
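To make the "first or last entry" effect concrete, here is a minimal sketch using a toy branch block. The `BranchBlock` type and `find_in_branch()` are hypothetical names invented for illustration, and the layout is an assumption, not Xapian's actual on-disk format or search routine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy branch block (hypothetical): sorted separator keys, where entry i
// points at the child holding keys >= block[i] and < block[i + 1].
using BranchBlock = std::vector<std::string>;

// Index of the entry to follow for `key`: the last entry whose separator
// is <= key, clamped to entry 0 when key sorts before every separator.
std::size_t find_in_branch(const BranchBlock& block, const std::string& key) {
    assert(!block.empty());
    auto it = std::upper_bound(block.begin(), block.end(), key);
    if (it == block.begin()) return 0;
    return static_cast<std::size_t>(it - block.begin()) - 1;
}
```

Searching for `"q"` in a block that really is on the path (say `{"m", "p", "s"}`) picks a meaningful entry. Searching in a block that lies entirely to the left or right of the path (say `{"a", "c", "e"}` or `{"t", "v", "x"}`) just clamps to the last or first entry, which says nothing useful about where `"q"` lives.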
I guess it gives a performance boost because we will want some of the
blocks in that arbitrary set, and so pre-reading something is better than
pre-reading nothing - we get reads for free while other stuff is going on.
But I think this should be calling `block_to_cursor()` while it descends
the tree, and then do the pre-read at the lowest level it reaches
(which might not necessarily be the leaf level, but there's not much point
iterating after we stop reading blocks).
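One possible reading of that suggestion, as a rough sketch over a toy in-memory tree rather than the real table code: `ToyTable`, `Block`, `find_entry()` and `preread_target()` are all hypothetical names, and the "cached" set is an assumption standing in for whatever decides that a block can be used without I/O. The line marked as the `block_to_cursor()` analogue only models the idea of putting the correct block into the cursor at each level.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model, not Xapian's real structures.
struct Block {
    std::vector<std::string> keys;  // sorted separator keys
    std::vector<int> children;      // child block numbers; empty for a leaf
};

struct ToyTable {
    std::map<int, Block> disk;  // every block, by block number
    std::set<int> cached;       // blocks usable without I/O (root assumed cached)
    std::vector<int> cursor;    // cursor[j] = block number held at depth j
    int root = 0;
};

// Last entry whose separator is <= key, clamped to entry 0.
std::size_t find_entry(const Block& b, const std::string& key) {
    assert(!b.keys.empty());
    auto it = std::upper_bound(b.keys.begin(), b.keys.end(), key);
    if (it == b.keys.begin()) return 0;
    return static_cast<std::size_t>(it - b.keys.begin()) - 1;
}

// Descend towards `key`, fixing up the cursor at each level, and return the
// block number worth pre-reading, or -1 if the whole path is already cached.
int preread_target(ToyTable& t, const std::string& key) {
    t.cursor.clear();
    int n = t.root;
    for (;;) {
        // Analogue of block_to_cursor(): the cursor now holds the block that
        // is actually on the path for `key`, not whatever was there before.
        t.cursor.push_back(n);
        const Block& b = t.disk.at(n);
        if (b.children.empty()) return -1;  // reached a leaf, nothing to do
        int child = b.children[find_entry(b, key)];
        // Lowest level reachable without reading: request the pre-read for
        // the next block on the path and stop iterating.
        if (t.cached.count(child) == 0) return child;
        n = child;
    }
}
```

With only the root cached, the function returns the level-1 block on the path for the key. Once that block is cached, a second call descends one level further and returns the leaf, so the pre-read is always issued for a block on the correct path rather than for a neighbour of whatever the cursor last held.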
--
Ticket URL: <http://trac.xapian.org/ticket/671#comment:3>
Xapian <http://xapian.org/>
Xapian