[Xapian-tickets] [Xapian] #326: Change doc length chunk encoding so skipping through a chunk is better than O(n)
Xapian
nobody at xapian.org
Sun Feb 4 21:21:46 GMT 2018
#326: Change doc length chunk encoding so skipping through a chunk is better than
O(n)
---------------------------+------------------------------
Reporter: richard | Owner: olly
Type: defect | Status: closed
Priority: normal | Milestone: 1.5.0
Component: Backend-Glass | Version: SVN trunk
Severity: normal | Resolution: fixed
Keywords: | Blocked By:
Blocking: | Operating System: All
---------------------------+------------------------------
Changes (by olly):
* status: assigned => closed
* resolution: => fixed
Comment:
The new honey backend, which I recently merged to master, stores document
lengths with a fixed-width encoding.
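To illustrate why a fixed-width encoding avoids the O(n) scan, here is a
minimal sketch in Python. It is not the honey on-disk format: the 4-byte
big-endian layout, the assumption that docids in a chunk are contiguous
from first_docid, and the names encode_chunk and doclen_at are all
hypothetical, chosen only to show the idea.

```python
import struct

WIDTH = 4  # bytes per entry (assumed for this sketch)


def encode_chunk(doclens):
    """Pack document lengths as fixed-width big-endian unsigned integers."""
    return b"".join(struct.pack(">I", n) for n in doclens)


def doclen_at(chunk, first_docid, docid):
    """Return the length stored for docid by seeking directly to its slot.

    Assumes the chunk holds one entry for every docid starting at
    first_docid, so the slot index is just the docid difference.
    """
    index = docid - first_docid
    if index < 0 or (index + 1) * WIDTH > len(chunk):
        raise KeyError(docid)
    (value,) = struct.unpack_from(">I", chunk, index * WIDTH)
    return value
```

With a layout like this, the offset of an entry is computed rather than
found by decoding every preceding entry, so a lookup within a chunk is
constant time.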
Currently it's a fixed 4 bytes per entry, which is somewhat wasteful, but
for glass a typical document length entry actually needs 3 bytes (the
document length values are typically >= 128 and < 16384, which takes 2
bytes, and we use another byte to store the docid delta, which is always 0
unless there are deleted documents).
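The 3-byte figure can be checked with a small sketch. It assumes a
variable-length integer encoding that carries 7 payload bits per byte,
which is consistent with the 128 and 16384 thresholds quoted above; the
function names are hypothetical.

```python
def varint_size(n):
    """Bytes needed for n in a 7-bits-per-byte variable-length encoding."""
    size = 1
    while n >= 128:
        n >>= 7
        size += 1
    return size


def glass_entry_size(docid_delta, doclen):
    """Size of one entry: encoded docid delta followed by encoded length."""
    return varint_size(docid_delta) + varint_size(doclen)
```

For example, glass_entry_size(0, 500) is 1 byte for the delta plus 2 bytes
for the length. Because entry sizes vary, finding a given docid means
decoding the entries before it, which is the O(n) skipping cost this
ticket is about.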
My plan is to allow the width to vary per chunk - not sure whether at byte
or bit granularity, or somewhere in between. Bit granularity is obviously
more compact, but the doclen data is not a huge amount of data, so the
additional saving may not justify the increased complexity (and hence
encoding and decoding time).
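The trade-off can be estimated with a rough sketch that picks one width
per chunk from the largest document length in it. This is only an
illustration of the arithmetic, not a proposed format; the helper names
are hypothetical, and per-chunk header overhead is ignored.

```python
def byte_width(max_doclen):
    """Smallest whole number of bytes that can hold max_doclen."""
    return max(1, (max_doclen.bit_length() + 7) // 8)


def bit_width(max_doclen):
    """Smallest number of bits that can hold max_doclen."""
    return max(1, max_doclen.bit_length())


def chunk_bytes(doclens, granularity):
    """Payload size in bytes for a non-empty chunk at a given granularity."""
    largest = max(doclens)
    if granularity == "byte":
        return byte_width(largest) * len(doclens)
    total_bits = bit_width(largest) * len(doclens)
    return (total_bits + 7) // 8  # round up to whole bytes
```

For a chunk of 1000 entries whose largest length is 9000, this gives 2000
bytes at byte granularity and 1750 bytes at bit granularity, against 4000
bytes for a fixed 4-byte width. Most of the saving in that case already
comes from the byte-granularity step.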
But while there's scope to improve further, the issue here is now
addressed, so closing.
--
Ticket URL: <https://trac.xapian.org/ticket/326#comment:26>
Xapian <https://xapian.org/>