Using a document id as metadata key and merges

Jean-Francois Dockes jf at dockes.org
Thu Dec 12 08:51:44 GMT 2024


Hi,

Following a discussion a few years ago, Recoll stores the documents' text
contents in database metadata entries, with keys derived from document ids.

More recently an index creation method using several temporary indexes
merged on completion was implemented. This is still a bit experimental. It
brings a significant speed increase in some cases.

I just realised that the merge lost many metadata entries because of
document id collisions (I was simply using add_document() on the temporary
dbs). This was not immediately obvious because it only affects snippet
generation.
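
To make the collision concrete, here is a toy model in plain Python. It does
not call Xapian: TempIndex, metadata_key() and merge_metadata() are
hypothetical stand-ins, the key format is invented for illustration, and the
merge is simplified to "last entry with a given key wins".

```python
def metadata_key(docid: int) -> str:
    # Hypothetical key format; the one Recoll really uses may differ.
    return f"doctext:{docid:010d}"


class TempIndex:
    """Toy temporary db: ids are allocated from 1, as with add_document()."""

    def __init__(self) -> None:
        self.last_docid = 0
        self.metadata: dict[str, str] = {}

    def add_document(self, text: str) -> int:
        # Allocate the next id and store the text under a key derived from it.
        self.last_docid += 1
        self.metadata[metadata_key(self.last_docid)] = text
        return self.last_docid


def merge_metadata(indexes: list[TempIndex]) -> dict[str, str]:
    # Simplified merge: when two dbs hold the same key, only one entry survives.
    merged: dict[str, str] = {}
    for idx in indexes:
        merged.update(idx.metadata)
    return merged


a, b = TempIndex(), TempIndex()
for text in ("a1", "a2", "a3"):
    a.add_document(text)
for text in ("b1", "b2"):
    b.add_document(text)

merged = merge_metadata([a, b])
print(len(merged))  # 3: only three of the five text entries survive
```

Both temporary dbs hand out ids 1 and 2, so the keys derived from them clash
and two entries are overwritten.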

Would using replace_document() on the temporary dbs, with unique document
ids (modulo), ensure that the document ids are preserved during the merge,
so that the metadata keys remain valid?
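
As a sketch of what I mean by "modulo" ids, again in plain Python and without
Xapian: sharded_docid() is a hypothetical helper, and the interleaving formula
is just one possible choice. It only shows that ids allocated this way never
collide across the temporary dbs; whether the merge then keeps them unchanged
is exactly the open question.

```python
def sharded_docid(seq: int, shard: int, nshards: int) -> int:
    """Map the seq-th (0-based) document of a shard to a globally unique id.

    Ids from different shards are interleaved, so shard k only ever uses
    ids congruent to k + 1 modulo nshards.
    """
    if not 0 <= shard < nshards:
        raise ValueError("shard must be in range(nshards)")
    return seq * nshards + shard + 1


NSHARDS = 3  # assumed number of temporary dbs
DOCS_PER_SHARD = 4

ids = [
    sharded_docid(seq, shard, NSHARDS)
    for shard in range(NSHARDS)
    for seq in range(DOCS_PER_SHARD)
]

# No id is allocated twice, so keys derived from the ids cannot clash.
assert len(set(ids)) == len(ids)
```

In the real indexer, each temporary db would pass the id computed this way to
replace_document(docid, doc) instead of letting add_document() pick one.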

Or is there another obvious approach which I am missing?

Cheers,

J.F. Dockes



More information about the Xapian-discuss mailing list