[Xapian-discuss] High memory usage when replacing a document
Vishesh Handa
me at vhanda.in
Tue Mar 25 14:33:37 GMT 2014
Hey guys
I've been trying to debug some really high memory usage that we have been
experiencing when trying to replace a document. The document in question was
produced by passing a 25+ MB text file through the term generator.
$ delve . -r 11021 -1 | wc -l
1019413
Piping this to a text file yields about 19 MB.
When indexing this document, memory usage skyrockets to about 400 MB (RAM, not
disk space). I've run it through Massif (output file attached; I would
recommend opening it in Massif Visualizer).
The main offenders seem to be the following:
1. 50 MB - std::string objects in ChertTermList
2. 77 MB - Document::add_posting seems to have some internal std::map. I'm
guessing this is its internal list of terms, though 77 MB seems like a LOT.
3. 46 MB - ChertWritableDatabase::add_freq_delta
4. 46 MB - ChertWritableDatabase::update_mod_plist
(1) looks like it is reading the terms from the database and keeping them in
memory.
(3) and (4) are probably related to when we create the WritableDatabase and
replace the document.
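For a sense of scale, here is a rough back-of-the-envelope sketch that divides each reported figure by the number of lines delve printed. It assumes that the line count is close to the number of distinct terms in the document and that "MB" means 10^6 bytes; neither is confirmed by the Massif output, and the dictionary labels are only shorthand for the items in the list above.

```python
# Rough per-term memory estimate from the figures quoted above.
# Assumptions (not verified): the `wc -l` count approximates the number of
# distinct terms, and 1 MB is taken as 1,000,000 bytes.

TERM_COUNT = 1_019_413   # lines printed by `delve . -r 11021 -1 | wc -l`
MB = 1_000_000           # assumed decimal megabyte

# Heap usage attributed to each call site in the Massif summary (in MB).
reported_mb = {
    "ChertTermList strings": 50,
    "Document::add_posting map": 77,
    "add_freq_delta": 46,
    "update_mod_plist": 46,
}


def bytes_per_term(megabytes, terms=TERM_COUNT):
    """Average number of bytes per term for a given total in MB."""
    return megabytes * MB / terms


per_term = {name: bytes_per_term(mb) for name, mb in reported_mb.items()}

# Average size of one line of the 19 MB delve dump (term plus newline).
avg_term_bytes = bytes_per_term(19)

# Share of the ~400 MB peak covered by the four call sites listed above.
accounted_mb = sum(reported_mb.values())

if __name__ == "__main__":
    for name, value in per_term.items():
        print(f"{name}: {value:.1f} bytes per term")
    print(f"average dumped term: {avg_term_bytes:.1f} bytes")
    print(f"accounted for: {accounted_mb} MB of ~400 MB")
```

Under those assumptions the std::map in Document::add_posting works out to roughly 75 bytes per term, against an average dumped term of under 19 bytes, and the four call sites together cover a bit more than half of the 400 MB peak.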
Are there any tips / variables I can configure to trim this memory usage down?
--
Vishesh Handa