[Xapian-discuss] Error msg xapian-compact: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation

Olly Betts olly at survex.com
Tue Jun 13 01:35:11 BST 2006


On Mon, Jun 12, 2006 at 03:12:21PM -0800, oscaruser at programmer.net wrote:
> I can increase this number, but my code stops to run scriptindexer for
> each single page. If I understand the function of
> XAPIAN_FLUSH_THRESHOLD, I would see an improved performance if I run
> scriptindex against several files at once, but not against single
> pages. I suppose buffering up 100 or so pages would help that.

Most definitely.  If you want to speed up indexing, the very first step
is to apply changes in large batches.  You're forcing the database to
flush for every single document.  Worse than that, you're opening and
closing the database for every single document!

100 or so is probably far too few if you want to maximise throughput.
For most large systems, the default limit of 10000 is likely to be too
low...
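
As a rough illustration of the batching idea, here is a minimal Python
sketch that buffers input files and runs scriptindex once per batch
instead of once per page.  It assumes the usual
"scriptindex DATABASE INDEXER_SCRIPT [INPUT_FILE]..." invocation and that
XAPIAN_FLUSH_THRESHOLD is read from the environment.  The helper names
(batched, build_command, index_in_batches), the file names, and the
numbers in the usage comment are hypothetical, not anything from the
original poster's setup.

```python
import os
import subprocess
from itertools import islice


def batched(paths, batch_size):
    """Yield successive lists of at most batch_size paths."""
    it = iter(paths)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def build_command(database, index_script, batch):
    """Build one scriptindex invocation covering a whole batch of input files."""
    # Assumed form: scriptindex DATABASE INDEXER_SCRIPT [INPUT_FILE]...
    return ["scriptindex", database, index_script, *batch]


def index_in_batches(paths, database, index_script, batch_size=1000,
                     flush_threshold=None, run=subprocess.run):
    """Run scriptindex once per batch; return the number of invocations."""
    env = dict(os.environ)
    if flush_threshold is not None:
        # Raise the flush threshold for the child process only.
        env["XAPIAN_FLUSH_THRESHOLD"] = str(flush_threshold)
    count = 0
    for batch in batched(paths, batch_size):
        # The database is opened and closed once per batch, not once per page.
        run(build_command(database, index_script, batch), env=env, check=True)
        count += 1
    return count


# Hypothetical usage (paths and numbers are placeholders):
# index_in_batches(page_files, "mydb", "index.script",
#                  batch_size=1000, flush_threshold=100000)
```

Very large batches may run into the operating system's limit on command
line length; writing many records into a single input file is one way
around that, if your input format allows it.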

Appending documents to a database in large batches is naturally
efficient, and it is also the case that has had the most optimisation
work done so far, since it's a very common speed-critical case.

Cheers,
    Olly


