[Xapian-discuss] merge database and maintain order
Mark Clarkson
mark.clarkson at smorg.co.uk
Sun Mar 25 06:16:20 BST 2007
On Sun, 2007-03-25 at 00:02 +0000, Olly Betts wrote:
> Hmm, actually I see a neat hack. If you add the first document to db2
> with a document id at least one more than the last document id of db1
> then the merged document ids will preserve the order within each db
> but put all the documents in db1 before those in db2.
Many thanks for such a prompt reply. I've now implemented this
workaround and it works perfectly - thanks very much!
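
For readers following along, here is a minimal sketch of the docid trick using a toy in-memory stand-in rather than the real Xapian API. ToyDb, merge_in_docid_order and demo_merge_order are hypothetical names invented for illustration. The sketch assumes that an explicit-docid "replace" raises the database's last docid and that later "add" calls continue from there.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a database: docid -> document text. Not the Xapian API.
struct ToyDb {
    std::map<unsigned, std::string> docs;
    unsigned lastdocid = 0;

    // Hands out the next unused docid, like a plain "add".
    unsigned add_document(const std::string& text) {
        docs[++lastdocid] = text;
        return lastdocid;
    }

    // Stores a document at an explicit docid and raises lastdocid if needed.
    void replace_document(unsigned did, const std::string& text) {
        assert(did != 0);  // docid 0 is treated as invalid in this toy model
        docs[did] = text;
        if (did > lastdocid) lastdocid = did;
    }
};

// Merge without renumbering. Only valid if the two docid sets are disjoint.
std::vector<std::string> merge_in_docid_order(const ToyDb& a, const ToyDb& b) {
    std::map<unsigned, std::string> merged(a.docs);
    merged.insert(b.docs.begin(), b.docs.end());
    std::vector<std::string> out;
    for (const auto& entry : merged) out.push_back(entry.second);
    return out;
}

// Build db1 and db2 as described above and return the merged document order.
std::vector<std::string> demo_merge_order() {
    ToyDb db1, db2;
    db1.add_document("a1");
    db1.add_document("a2");
    // The first document of db2 goes in at db1's last docid + 1 ...
    db2.replace_document(db1.lastdocid + 1, "b1");
    // ... and later additions simply continue from there.
    db2.add_document("b2");
    return merge_in_docid_order(db1, db2);
}
```

In this toy model every document of db1 sorts before every document of db2, and the order within each database is unchanged.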
> Currently
> xapian-compact preserves spans of unused document ids at the start and
> end of the database, but that would be easy to fix.
Again, thank you for this important piece of information. I did not see
why this would be important, but after testing it I can see that I could
run out of document ids in a relatively short space of time, depending on
the size of the collection.
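
To illustrate how quickly the ids can run out, here is a rough arithmetic sketch. It assumes a workflow the post does not spell out: equal-sized batches are merged into a growing main database one at a time, each batch starts at the main database's last docid + 1, and the merge adds the running total of last docids as an offset (as the unpatched code in the diff below does). The function name and parameters are hypothetical, and a 32-bit docid is assumed in the usage note.

```cpp
#include <cstdint>

// Number of merges that fit before the last docid would exceed max_docid.
unsigned rounds_until_exhausted(std::uint64_t batch_size,
                                std::uint64_t max_docid) {
    if (batch_size == 0 || batch_size > max_docid) return 0;
    std::uint64_t last = batch_size;  // main database after the first batch
    unsigned rounds = 0;
    while (true) {
        // The new batch was numbered from last + 1 to avoid clashes.
        std::uint64_t batch_last = last + batch_size;
        // The merge then adds the main database's last docid as an offset.
        std::uint64_t merged_last = last + batch_last;
        if (merged_last > max_docid) return rounds;
        last = merged_last;
        ++rounds;
    }
}
```

Under these assumptions the last docid roughly doubles on every merge, so rounds_until_exhausted(1000, 4294967295ULL) returns 21. Without the added offset the last docid would only grow by the batch size each time.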
I've hacked xapian-compact so that it doesn't add offsets and it seems
to work, but now it will break horribly if I try to merge databases that
have the same document ids.
I guess I'll have to be careful ;-)
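
One way to "be careful" is to check, before merging, that the docid ranges of the source databases do not overlap. The sketch below is a hypothetical helper, not part of xapian-compact. It assumes the caller already knows the first and last docid in use for each source (the last could come from get_lastdocid(), as in the diff below; how the first is obtained is left open).

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Inclusive [first, last] docid range used by one source database.
using DocidRange = std::pair<unsigned, unsigned>;

// Returns true if no two ranges share a docid.
bool ranges_are_disjoint(std::vector<DocidRange> ranges) {
    // Sort by first docid so only neighbouring ranges need comparing.
    std::sort(ranges.begin(), ranges.end());
    for (std::size_t i = 1; i < ranges.size(); ++i) {
        if (ranges[i].first <= ranges[i - 1].second) return false;
    }
    return true;
}
```

This check is conservative: two databases whose ids interleave without actually colliding would still be rejected.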
Cheers
Mark.
--- bin/xapian-compact.cc.orig 2007-03-25 04:13:36.000000000 +0000
+++ bin/xapian-compact.cc 2007-03-25 04:13:39.000000000 +0000
@@ -152,7 +152,7 @@
if (in->get_entry_count()) {
// PostlistCursor takes ownership of FlintTable in and
// is responsible for deleting it.
- PostlistCursor * cur = new PostlistCursor(in, *offset);
+ PostlistCursor * cur = new PostlistCursor(in, 0);//*offset);
// Merge the METAINFO tags from each database into one.
// They have a key with a single zero byte, which will
// always be the first key.
@@ -322,9 +322,9 @@
Xapian::Database db(srcdir);
// No point trying to merge empty databases!
if (db.get_doccount() != 0) {
- Xapian::docid last = db.get_lastdocid();
- offset.push_back(tot_off);
- tot_off += last;
+ //Xapian::docid last = db.get_lastdocid();
+ //offset.push_back(tot_off);
+ //tot_off += last;
// FIXME: prune unused docids off the start and end of each range...
sources.push_back(string(srcdir) + '/');
}