Xapian::WritableDatabase: commit changes depending on the buffer size

Jean-Francois Dockes jf at dockes.org
Sat Feb 6 17:29:05 GMT 2016


john veter writes:
 > Hi. I have a lot of documents with different file sizes. While indexing, I call commit() every 1000 documents (this is the default value, the user can change it). The problem is the following: the indexing process runs smoothly while indexing small files. The indexer uses about half of the available RAM. But at some point it hits a bunch of bigger documents. As a result, the RAM usage increases drastically. Finally, I just run out of memory.
 > 
 > I think that the solution is to call commit() depending on the actual size of the buffer (in megabytes), rather than on the number of indexed documents. So, is there any way to estimate the size of the buffer of the Xapian::WritableDatabase object?
 > 
 > P.S. Maybe somebody has other suggestions on how to solve the problem?

I had the same issue quite a long time ago. I changed the indexer to check
after adding/updating/deleting each document and to flush based on the total
amount of input text processed since the last flush, independently of the
number of documents.

I am not claiming that this is a rigorous solution, but it apparently solved
the problem: at least, nobody seems to complain about memory usage any more.
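
For illustration, here is a minimal sketch of that kind of check. It is not
the actual indexer code: the class name TextBudgetCommitter, the threshold
value and the CountingDb stand-in are all made up for the example. Db can be
any type with a commit() method; in a real indexer it would presumably be
Xapian::WritableDatabase. Note that the amount of input text is only a proxy
for the memory the pending changes use, not a measurement of it.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Xapian: triggers a commit once the amount
// of input text indexed since the last commit reaches a byte threshold.
// Db is any type with a commit() member function.
template <typename Db>
class TextBudgetCommitter {
public:
    TextBudgetCommitter(Db& db, std::size_t threshold_bytes)
        : db_(db), threshold_(threshold_bytes), pending_(0) {}

    // Call after each add/update/delete, passing the size of the input text
    // for that document. Returns true if a commit was triggered.
    bool note_indexed(std::size_t text_bytes) {
        pending_ += text_bytes;
        if (pending_ < threshold_) return false;
        db_.commit();
        pending_ = 0;  // start counting again from the new commit point
        return true;
    }

    // Bytes of input text accumulated since the last commit.
    std::size_t pending_bytes() const { return pending_; }

private:
    Db& db_;
    std::size_t threshold_;
    std::size_t pending_;
};

// Stand-in database (an assumption for this example) so the helper can be
// tried without Xapian installed. It only counts commit() calls.
struct CountingDb {
    int commits = 0;
    void commit() { ++commits; }
};
```

With a 1000-byte threshold, documents of 400 and 500 bytes leave 900 bytes
pending, and a following 200-byte document triggers one commit. A single
document larger than the threshold triggers a commit by itself, which is the
case that a per-document counter misses.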

Cheers,

jf




More information about the Xapian-discuss mailing list