[Xapian-discuss] High memory usage when replacing a document

Vishesh Handa me at vhanda.in
Tue Mar 25 14:33:37 GMT 2014


Hey guys

I've been trying to debug some really high memory usage that we have been 
experiencing when trying to replace a document. The document in question was 
produced by passing a 25+ MB text file through the term generator.

$ delve . -r 11021 -1 | wc -l
1019413

Piping this output to a text file gives a file of about 19 MB.

When indexing this document, the memory usage of the process skyrockets to 
about 400 MB (RAM, not disk space). I've run it through massif (output file 
attached; I would recommend opening it in massif-visualizer).

The main offenders seem to be the following -

1. 50 MB - std::string objects in ChertTermList
2. 77 MB - Document::add_posting seems to have some internal std::map. I'm 
guessing this is its internal list of terms, though 77 MB seems like a LOT.
3. 46 MB - ChertWritableDatabase::add_freq_delta
4. 46 MB - ChertWritableDatabase::update_mod_plist

(1) looks like it is reading the existing terms from the database and keeping 
them in memory.

(3) and (4) are probably related to the point where we create the 
WritableDatabase and replace the document.
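
As a rough sanity check on these figures, the massif numbers can be divided by 
the term count reported by delve to get an average cost per term. This is only 
a back-of-the-envelope sketch: it assumes every line of the delve output is one 
term, that "MB" means 10**6 bytes, and it ignores how many positions each term 
has, so the per-term values are averages and not exact allocation sizes.

```python
# Figures quoted in the post; MB is assumed to mean 10**6 bytes here.
MB = 10**6
TERM_COUNT = 1_019_413       # lines from `delve . -r 11021 -1 | wc -l`
TERM_LIST_BYTES = 19 * MB    # approximate size of that listing on disk

# Heap usage that massif attributed to each call site, in MB.
massif_mb = {
    "std::string in ChertTermList": 50,
    "Document::add_posting (std::map)": 77,
    "ChertWritableDatabase::add_freq_delta": 46,
    "ChertWritableDatabase::update_mod_plist": 46,
}


def bytes_per_term(megabytes, terms=TERM_COUNT):
    """Average number of heap bytes per term for one call site."""
    return megabytes * MB / terms


# Average length of one listed term, including its trailing newline.
avg_term_line = TERM_LIST_BYTES / TERM_COUNT

per_term = {name: bytes_per_term(mb) for name, mb in massif_mb.items()}
total_mb = sum(massif_mb.values())

if __name__ == "__main__":
    print(f"average listed term (incl. newline): {avg_term_line:.1f} bytes")
    for name, cost in per_term.items():
        print(f"{name}: {cost:.1f} bytes per term")
    print(f"four call sites together: {total_mb} MB of the ~400 MB peak")
```

With these assumptions the average term is under 20 bytes of text, while each 
of the four call sites costs somewhere between about 45 and 76 bytes per term, 
so most of the memory would be per-entry container and string overhead on top 
of the term text itself.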

Are there any tips / variables I can configure to trim this memory usage down?

-- 
Vishesh Handa

