[Xapian-tickets] [Xapian] #451: Add option to xapian-compact to rebuild postlist chunks
Xapian
nobody at xapian.org
Mon Feb 22 13:23:38 GMT 2010
#451: Add option to xapian-compact to rebuild postlist chunks
-------------------------+--------------------------------------------------
 Reporter:  richard      |       Owner:  richard
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:  1.2.x
Component:  Other        |     Version:  SVN trunk
 Severity:  normal       |   Blockedby:
 Platform:  All          |    Blocking:
-------------------------+--------------------------------------------------
Currently, xapian-compact simply stitches existing chunks in the postlist
and value list together. This is fast, but has two significant drawbacks:

 - If document IDs are being preserved (via the --no-renumber option),
   xapian-compact cannot merge databases with overlapping document ID ranges
   (even if no documents occur in both databases).
 - Modifications to a database can result in many small chunks;
   recombining these chunks into larger chunks should result in faster
   searches. Xapian-compact doesn't currently do this.
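
To illustrate the first drawback, here is a minimal sketch in Python of
why stitching breaks down on overlapping ranges. It is a toy model, not
Xapian code: a "chunk" is assumed to be a plain sorted list of document
IDs for one term, and the function name stitch() is hypothetical.

```python
from typing import List

# Toy model (an assumption, not the chert on-disk format): a chunk is a
# non-empty, sorted list of document IDs for a single term.
Chunk = List[int]


def stitch(databases: List[List[Chunk]]) -> List[Chunk]:
    """Concatenate whole chunks from several databases without decoding them.

    Chunks are ordered by their first document ID and copied as-is. The
    result is only a valid postlist if every chunk ends before the next
    one starts, so a ValueError is raised when two ID ranges overlap.
    """
    chunks = sorted(
        (chunk for db in databases for chunk in db),
        key=lambda chunk: chunk[0],
    )
    for prev, nxt in zip(chunks, chunks[1:]):
        if prev[-1] >= nxt[0]:
            raise ValueError(
                "chunk ending at %d overlaps chunk starting at %d"
                % (prev[-1], nxt[0])
            )
    return chunks


# Disjoint ranges: the chunks can simply be placed one after the other.
disjoint = stitch([[[1, 2, 3]], [[10, 11]]])

# Interleaved IDs: no document occurs in both inputs, yet the ranges
# 1..5 and 2..6 overlap, so whole-chunk stitching cannot order them.
try:
    stitch([[[1, 3, 5]], [[2, 4, 6]]])
    interleaved_ok = True
except ValueError:
    interleaved_ok = False
```

In this model the second call fails even though the two inputs share no
document, which mirrors the situation described in the first point.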

I propose adding a new option to xapian-compact, "--rebuild-chunks", which
rebuilds the postlist chunks (and also the valuelist and document length
chunks), packing them optimally and allowing overlapping document ID
ranges. I have implemented a patch which adds this option for the chert
backend, with some tests in api_compact.cc, and it seems to work well for
me.
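
The idea of rebuilding chunks can be sketched as follows. This is a toy
model in Python, not the patch itself: chunks are assumed to be sorted
lists of document IDs, the name rebuild_chunks() and the chunk_size
parameter are hypothetical, and real chunks are bounded by encoded size
rather than by entry count. Rejecting a document ID that appears in more
than one input is also an assumption of the sketch.

```python
import heapq
from typing import List

# Toy model (an assumption): a chunk is a sorted list of document IDs,
# and the chunks of each database are stored in ascending order.
Chunk = List[int]


def rebuild_chunks(databases: List[List[Chunk]], chunk_size: int) -> List[Chunk]:
    """Decode every entry, merge by document ID, and repack into full chunks.

    Unlike whole-chunk stitching, this works when the ID ranges of the
    inputs interleave, and it replaces many small chunks with fewer
    chunks of up to chunk_size entries each.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    # One ascending stream of document IDs per input database.
    streams = [(docid for chunk in db for docid in chunk) for db in databases]

    rebuilt: List[Chunk] = []
    current: Chunk = []
    last = None
    for docid in heapq.merge(*streams):
        if docid == last:
            # Assumption: the same document in two inputs is an error.
            raise ValueError("document ID %d occurs more than once" % docid)
        last = docid
        current.append(docid)
        if len(current) == chunk_size:
            rebuilt.append(current)
            current = []
    if current:
        rebuilt.append(current)
    return rebuilt


# Two inputs made of many one-entry chunks with interleaved IDs.
fragmented_a = [[1], [3], [5]]
fragmented_b = [[2], [4], [6]]
packed = rebuild_chunks([fragmented_a, fragmented_b], chunk_size=4)
```

Here six one-entry chunks with interleaved IDs become two chunks, the
first of which is full. The cost is that every entry has to be decoded
and re-encoded, which is why this would be slower than stitching.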
--
Ticket URL: <http://trac.xapian.org/ticket/451>
Xapian <http://xapian.org/>