[Xapian-tickets] [Xapian] #671: Performance issues when querying over large number of local databases (shards)
Xapian
nobody at xapian.org
Fri May 8 06:42:06 BST 2015
#671: Performance issues when querying over large number of local databases
(shards)
----------------------------------+--------------------------
Reporter: wgreenberg | Owner: olly
Type: defect | Status: new
Priority: normal | Milestone:
Component: Matcher | Version:
Severity: normal | Resolution:
Keywords: sharding performance | Blocked By:
Blocking: | Operating System: Linux
----------------------------------+--------------------------
Comment (by olly):
I've been looking at this today.
I've fixed a few issues with the patch:
* We shouldn't try to preread anything for a table with 0 levels, as then
the root block is a leaf block, and trying to read it as a branch block at
best gives nonsense block numbers to preread, which then fail when
assertions are enabled.
* The assertion to check that the block number is valid for `GlassTable`
didn't compile (it appears to have just been copied from the chert case).
* I've added a cache of the previous block number preread for each table,
and use it to avoid repeated requests for the same block.
* I've made it request the query terms sorted in byte order, which works
nicely with the previous change to reduce the number of `posix_fadvise()`
calls.
* The termname was being used as the key to the postlist table for
prereading, which isn't quite right, though it's close enough for your
benchmarking results to be valid.
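
To illustrate how the 0-level check, the per-table cache and the sorted
terms fit together, here is a rough sketch. It is not the code from the
patch: `TableSketch`, `preread_block()`, `preread_terms()` and the
`block_for_key` callback are hypothetical names made up for this example,
and the real postlist key is not just the termname (as noted above), which
the sketch glosses over. Only `posix_fadvise()` with `POSIX_FADV_WILLNEED`
is a real call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

#include <fcntl.h>  // posix_fadvise(), POSIX_FADV_WILLNEED

// Hypothetical stand-in for a B-tree table; not Xapian's real GlassTable.
struct TableSketch {
    int fd = -1;                 // table file descriptor (-1: no file open)
    int levels = 0;              // 0 means the root block is itself a leaf
    unsigned block_size = 8192;  // assumed block size for the sketch
    std::uint32_t block_count = 0;
    std::int64_t last_preread = -1;  // previous block hinted, -1 if none

    // Returns true if a read-ahead hint was issued for block n.
    bool preread_block(std::uint32_t n) {
        if (levels == 0) return false;  // root is a leaf: nothing to preread
        assert(n < block_count);        // block number must be valid
        if (std::int64_t(n) == last_preread) return false;  // repeat request
        last_preread = n;
        if (fd >= 0) {
            // Advisory only, so the return value is deliberately ignored.
            (void)posix_fadvise(fd, off_t(n) * block_size, block_size,
                                POSIX_FADV_WILLNEED);
        }
        return true;
    }
};

// Sorts the terms in byte order, then hints the block for each one.
// block_for_key is a hypothetical callback mapping a key to a block number.
// Returns how many hints were actually issued.
template <typename BlockForKey>
unsigned preread_terms(TableSketch& table, std::vector<std::string> terms,
                       BlockForKey block_for_key) {
    std::sort(terms.begin(), terms.end());
    unsigned hints = 0;
    for (const std::string& term : terms) {
        if (table.preread_block(block_for_key(term))) ++hints;
    }
    return hints;
}
```

With the terms sorted, keys which land in the same block end up adjacent,
so a cache of just the one previous block number is enough to skip the
repeats. Unsorted, the same terms could bounce between blocks and defeat
the cache.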
I'm a bit surprised that the read-ahead for the record/docdata table makes
a difference, as it seems to just preread each docid right before reading
it, so I wouldn't expect it to really help. But IIUC you reported above
that it does.
I'll attach an updated patch shortly.
--
Ticket URL: <http://trac.xapian.org/ticket/671#comment:11>
Xapian <http://xapian.org/>