[Xapian-tickets] [Xapian] #451: Add option to xapian-compact to rebuild postlist chunks

Xapian nobody at xapian.org
Mon Feb 22 13:23:38 GMT 2010


#451: Add option to xapian-compact to rebuild postlist chunks
-------------------------+--------------------------------------------------
 Reporter:  richard      |       Owner:  richard  
     Type:  enhancement  |      Status:  new      
 Priority:  normal       |   Milestone:  1.2.x    
Component:  Other        |     Version:  SVN trunk
 Severity:  normal       |   Blockedby:           
 Platform:  All          |    Blocking:           
-------------------------+--------------------------------------------------
 Currently, xapian-compact simply stitches together the existing chunks in
 the postlist and valuelist.  This is fast, but has two significant drawbacks:

  - If document IDs are being preserved (via the --no-renumber option),
 xapian-compact cannot merge databases with overlapping document ID ranges
 (even if no documents occur in both databases).

  - Modifications to a database can result in many small chunks;
 recombining these into larger chunks should result in faster searches.
 xapian-compact doesn't currently do this.
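
 The first drawback can be illustrated with a toy model.  The sketch below
 assumes each input database is reduced to a sorted list of document IDs;
 DocIds, ranges_overlap and share_document are hypothetical names used only
 for illustration, not part of Xapian.  It distinguishes "the ID ranges
 overlap" from "some document occurs in both inputs", which is the
 distinction that chunk stitching cannot make.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Toy model (assumption): a database is just its sorted, unique document IDs.
using DocIds = std::vector<unsigned>;

// True if the [first, last] document ID ranges of the two inputs intersect.
bool ranges_overlap(const DocIds& a, const DocIds& b) {
    if (a.empty() || b.empty()) return false;
    return a.front() <= b.back() && b.front() <= a.back();
}

// True if at least one document ID occurs in both inputs.
bool share_document(const DocIds& a, const DocIds& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    DocIds common;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(common));
    return !common.empty();
}
```

 For inputs holding IDs {1, 3, 5} and {2, 4, 6}, the ranges overlap but no
 document is shared, so a merge that interleaves entries by ID would be
 safe even though appending whole chunks is not.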

 I propose adding a new option to xapian-compact: "--rebuild-chunks", which
 rebuilds the postlist chunks (and also the valuelist chunks and the
 document length chunks), packing them optimally and allowing overlapping
 document IDs.  I have implemented a patch which adds this option for the
 chert backend, with some tests in api_compact.cc, and this seems to work
 well for me.
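
 The idea of rebuilding rather than stitching can be sketched as follows.
 This is a simplified model and not the patch itself: a chunk is taken to be
 a sorted list of document IDs, each input's chunks are assumed to be in
 ascending order, no document is assumed to occur in both inputs, and a
 fixed entry count (max_entries) stands in for the real chunk size limit.
 Chunk, flatten and rebuild_chunks are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Toy model (assumption): a chunk is a sorted run of document IDs.
using Chunk = std::vector<unsigned>;

// Concatenate the chunks of one input; the result is sorted because the
// chunks are assumed to be in ascending, non-overlapping order.
std::vector<unsigned> flatten(const std::vector<Chunk>& chunks) {
    std::vector<unsigned> ids;
    for (const Chunk& c : chunks) ids.insert(ids.end(), c.begin(), c.end());
    return ids;
}

// Merge two inputs by document ID, then repack the merged entries into
// chunks holding at most max_entries IDs each.
std::vector<Chunk> rebuild_chunks(const std::vector<Chunk>& a,
                                  const std::vector<Chunk>& b,
                                  std::size_t max_entries) {
    assert(max_entries > 0);
    const std::vector<unsigned> ia = flatten(a);
    const std::vector<unsigned> ib = flatten(b);
    std::vector<unsigned> merged;
    std::merge(ia.begin(), ia.end(), ib.begin(), ib.end(),
               std::back_inserter(merged));

    std::vector<Chunk> out;
    for (std::size_t i = 0; i < merged.size(); i += max_entries) {
        const std::size_t end = std::min(i + max_entries, merged.size());
        out.emplace_back(merged.begin() + i, merged.begin() + end);
    }
    return out;
}
```

 With inputs chunked as {1} {3} {5} and {2, 4} {6} and max_entries set to 4,
 this model yields two chunks, {1, 2, 3, 4} and {5, 6}, in place of the five
 small chunks that stitching would carry over.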

-- 
Ticket URL: <http://trac.xapian.org/ticket/451>
Xapian <http://xapian.org/>