[Xapian-tickets] [Xapian] #671: Performance issues when querying over large number of local databases (shards)

Xapian nobody at xapian.org
Fri May 8 06:42:06 BST 2015


#671: Performance issues when querying over large number of local databases
(shards)
----------------------------------+--------------------------
 Reporter:  wgreenberg            |             Owner:  olly
     Type:  defect                |            Status:  new
 Priority:  normal                |         Milestone:
Component:  Matcher               |           Version:
 Severity:  normal                |        Resolution:
 Keywords:  sharding performance  |        Blocked By:
 Blocking:                        |  Operating System:  Linux
----------------------------------+--------------------------

Comment (by olly):

 I've been looking at this today.

 I've fixed a few issues with the patch:

  * We shouldn't try to preread anything for a table with 0 levels, as then
 the root block is a leaf block, and trying to read it as a branch block at
 best gives nonsense block numbers to preread, which fail when assertions
 are enabled.

  * The assertion to check the block number is valid for `GlassTable`
 didn't compile (it appears to have just been copied from the chert case).

  * I've added a cache of the previous block number preread for each table,
 and use it to avoid repeated requests for the same block.

  * I've made it request the query terms sorted in byte order, which works
 nicely with the previous change to reduce the number of `posix_fadvise()`
 calls.

  * The termname was being used as the key to the postlist table for
 prereading, which isn't quite right, though it's close enough for your
 benchmarking results to be valid.
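
 To illustrate how the last two changes before this one interact (the
 per-table cache of the last block preread, and requesting terms in byte
 order), here is a minimal standalone sketch. It is not the code from the
 patch: `PrereadCache`, `blocks_to_preread()` and the `block_for_term`
 callback are hypothetical names, and the callback stands in for the B-tree
 descent that finds which block holds a term's postlist. The real request
 would be something like a `posix_fadvise()` call with
 `POSIX_FADV_WILLNEED`; here the block numbers are just collected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): remembers the last block number
// for which a preread was requested on one table, so that an immediately
// repeated request for the same block can be skipped.
class PrereadCache {
    static constexpr std::uint32_t NONE = UINT32_MAX;
    std::uint32_t last_block = NONE;

  public:
    // Returns true if a preread request should be issued for `block`.
    bool note(std::uint32_t block) {
        assert(block != NONE);  // NONE is reserved as the "nothing yet" marker.
        if (block == last_block) return false;
        last_block = block;
        return true;
    }
};

// Sorts the terms in byte order, maps each one to a block number using
// `block_for_term` (an assumed stand-in for the B-tree lookup), and returns
// the block numbers for which a preread would actually be requested.
std::vector<std::uint32_t>
blocks_to_preread(std::vector<std::string> terms,
                  const std::function<std::uint32_t(const std::string&)>&
                      block_for_term)
{
    // std::string's operator< compares elements as unsigned char, so a
    // plain sort gives byte order.
    std::sort(terms.begin(), terms.end());

    PrereadCache cache;
    std::vector<std::uint32_t> requested;
    for (const std::string& term : terms) {
        std::uint32_t block = block_for_term(term);
        // Neighbouring keys tend to share a block, so after sorting the
        // one-entry cache filters out most repeated requests.
        if (cache.note(block)) requested.push_back(block);
    }
    return requested;
}
```

 With a one-entry cache, only consecutive requests for the same block are
 dropped, so the saving depends on terms that share a block being
 adjacent, which is what the byte-order sort arranges.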

 I'm a bit surprised that the read-ahead for the record/docdata table makes
 a difference, as it seems to just preread each docid right before reading
 it, so it seems like it wouldn't really help.  But you reported above that
 this helps IIUC.

 I'll attach an updated patch shortly.

--
Ticket URL: <http://trac.xapian.org/ticket/671#comment:11>
Xapian <http://xapian.org/>