[Xapian-discuss] change the doc_id

Olly Betts olly at survex.com
Sat Jan 13 01:26:18 GMT 2007


On Fri, Jan 05, 2007 at 05:38:40PM +0100, Felix Antonius Wilhelm Ostmann wrote:
> the filename is the doc_id and I don't want to rename millions of
> documents during indexing or create symlinks.

As James suggested, use Xapian::WritableDatabase::replace_document() if
you want to specify the document ids.

However, if you're doing this with millions of documents, it's a good
idea to arrange to add documents in ascending docid order as that will
be significantly faster.
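
A minimal sketch of that ordering, assuming the filenames are plain
decimal numbers (as "the filename is the doc_id" suggests).  The helper
names docid_from_filename() and index_in_docid_order() and the make_doc
callback are hypothetical, made up for illustration.  Db is a template
parameter so the same code works with anything that offers
replace_document(docid, document), which is the signature
Xapian::WritableDatabase provides:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turn a purely numeric filename such as "12345"
// into a docid.  Returns 0 (never a valid docid) if the name is not a
// positive decimal number that fits in 32 bits.
inline std::uint32_t docid_from_filename(const std::string& name) {
    if (name.empty()) return 0;
    std::uint64_t value = 0;
    for (char c : name) {
        if (c < '0' || c > '9') return 0;
        value = value * 10 + static_cast<std::uint64_t>(c - '0');
        if (value > 0xFFFFFFFFull) return 0;
    }
    return static_cast<std::uint32_t>(value);
}

// Sort the files by docid, then store them in ascending docid order.
// make_doc(filename) is assumed to build the document to store; with
// Xapian it would return a Xapian::Document.
// Returns the number of documents stored; unusable names are skipped.
template <typename Db, typename MakeDoc>
std::size_t index_in_docid_order(Db& db,
                                 std::vector<std::string> filenames,
                                 MakeDoc make_doc) {
    std::vector<std::pair<std::uint32_t, std::string>> items;
    for (auto& name : filenames) {
        std::uint32_t id = docid_from_filename(name);
        if (id != 0) items.emplace_back(id, std::move(name));
    }
    std::sort(items.begin(), items.end());  // ascending by docid
    for (const auto& item : items) {
        db.replace_document(item.first, make_doc(item.second));
    }
    return items.size();
}
```

With millions of files you would probably sort and feed them in batches
rather than hold every name in memory at once; the point is only that
the docids reach replace_document() in ascending order.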

Also, the compression used in the backend will be less effective if
the docids you use are sparse.  No need to totally avoid gaps, but
don't go too mad.
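
To get a rough feel for how sparse a set of docids is before indexing,
something like the following could be used.  This is only an
illustrative measure; docid_spread() and DocidSpread are hypothetical
names, and nothing here reflects how the backend itself encodes or
compresses docids:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct DocidSpread {
    double density;         // used docids / highest docid, 0 if none
    std::uint32_t max_gap;  // longest run of unused docids below a used one
};

// Summarise how tightly packed a set of docids is.  Docids start at 1,
// so any zeros in the input are ignored, as are duplicates.
inline DocidSpread docid_spread(std::vector<std::uint32_t> ids) {
    DocidSpread out{0.0, 0};
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
    if (!ids.empty() && ids.front() == 0) ids.erase(ids.begin());
    if (ids.empty()) return out;

    std::uint32_t prev = 0;  // docid before the first possible one
    for (std::uint32_t id : ids) {
        out.max_gap = std::max(out.max_gap, id - prev - 1);
        prev = id;
    }
    out.density = static_cast<double>(ids.size()) /
                  static_cast<double>(ids.back());
    return out;
}
```

For the docids 1, 2, 3 and 10 this reports a density of 0.4 and a
largest gap of 6 (docids 4 to 9 unused).  A density close to 1 means
few gaps; a very small one suggests the docids are spread thinly.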

Cheers,
    Olly


