[Xapian-tickets] [Xapian] #451: Add option to compaction to rebuild postlist chunks

Xapian nobody at xapian.org
Tue Jan 28 05:09:53 GMT 2020


#451: Add option to compaction to rebuild postlist chunks
-----------------------------+------------------------------------
 Reporter:  Richard Boulton  |             Owner:  Richard Boulton
     Type:  enhancement      |            Status:  new
 Priority:  normal           |         Milestone:  1.5.0
Component:  Library API      |           Version:  git master
 Severity:  normal           |        Resolution:
 Keywords:                   |        Blocked By:
 Blocking:                   |  Operating System:  All
-----------------------------+------------------------------------
Changes (by Olly Betts):

 * version:  SVN trunk => git master
 * milestone:  1.4.x => 1.5.0

Comment:

 > If document IDs are being preserved (via the --no-renumber option),
 xapian-compact cannot merge databases with overlapping document ID ranges
 (even if no documents occur in both databases).

 I wonder whether this one is really an unreasonable limitation.  Nobody
 has complained about it since, as far as I can recall.  Did you have a use
 case for it?
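
 To illustrate the limitation quoted above, here is a minimal sketch of the
 difference between overlapping document ID ranges and document IDs that
 are actually used in both databases.  The function names and the sample
 IDs are made up for illustration; this is not xapian-compact's code.

```python
def ranges_overlap(ids_a, ids_b):
    """True if the [min, max] docid ranges of two non-empty databases overlap."""
    return min(ids_a) <= max(ids_b) and min(ids_b) <= max(ids_a)


def share_documents(ids_a, ids_b):
    """True if some docid is actually used in both databases."""
    return bool(set(ids_a) & set(ids_b))


# Hypothetical docids in use in two databases.
db_a = [1, 5, 9]
db_b = [2, 6]

print(ranges_overlap(db_a, db_b))   # ranges 1..9 and 2..6 overlap
print(share_documents(db_a, db_b))  # yet no docid is used in both
```

 The ranges overlap even though no document ID is used twice, which is the
 case the quoted text says cannot currently be merged with --no-renumber.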

 > Modifications to a database can result in many small chunks; recombining
 these chunks into larger chunks should result in faster searches.
 xapian-compact doesn't currently do this.

 Ideally these would get combined in the normal course of operations, but
 even then there's still the case of merging several databases and a term
 occurring a small number of times in each - then we potentially have one
 small postlist chunk per input database.

 Commit 217a67f792a93ceb085749c42a66c8829f1a9573 improves this for honey on
 git master: adjacent input chunks are now spliced together until doing so
 would exceed HONEY_POSTLIST_CHUNK_MAX.  We don't currently try to split
 input chunks, so it's not a full version of what's proposed here, but this
 splicing can be done without decoding, so it's faster than rebuilding the
 chunks would be.
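
 The splicing described above can be sketched roughly as follows.  This is
 only an illustration under simplifying assumptions, not the honey code:
 chunks are treated as opaque byte strings, any per-chunk header handling
 the real format needs is ignored, and CHUNK_MAX and splice_chunks are
 made-up names, with CHUNK_MAX standing in for HONEY_POSTLIST_CHUNK_MAX.

```python
CHUNK_MAX = 2000  # assumed stand-in value; the real limit is not stated here


def splice_chunks(chunks, chunk_max=CHUNK_MAX):
    """Greedily join adjacent encoded chunks without decoding them.

    A chunk is appended to the current output chunk unless that would
    push it past chunk_max bytes.  Input chunks are never split, so an
    oversized input chunk is passed through unchanged.
    """
    out = []
    current = b""
    for chunk in chunks:
        if current and len(current) + len(chunk) > chunk_max:
            # Adding this chunk would exceed the limit: emit what we have.
            out.append(current)
            current = b""
        current += chunk
    if current:
        out.append(current)
    return out


# One small chunk for the same term from each of three input databases.
inputs = [b"a" * 10, b"b" * 10, b"c" * 10]
print([len(c) for c in splice_chunks(inputs, chunk_max=25)])
```

 With a limit of 25 bytes, the first two 10-byte chunks are joined and the
 third starts a new output chunk, so three small chunks become two.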

 At this point I don't think we'd do this for glass or 1.4.x, but rather
 for honey in the next release series.
-- 
Ticket URL: <https://trac.xapian.org/ticket/451#comment:5>
Xapian <https://xapian.org/>