[Xapian-discuss] Python bindings not freeing memory during indexing
Richard Boulton
richard at lemurconsulting.com
Sat Jul 7 18:23:16 BST 2007
EJ Johnson wrote:
> Hi list,
>
> I'm new to Xapian (great stuff!!) but am running into a problem that I haven't seen explicitly mentioned on the list before.
>
> I'm using the Python bindings for Xapian 1.0.1 on Ubuntu Dapper 6.06 LTS using xapian.org as my repository. My hardware is an HP DL385 G2, two dual-core AMD Opterons with 8G RAM.
>
> I'm trying to index a good chunk of documents and have a Python indexer iterating through the docs and adding them to the DB. I get up to about 45,000 docs and it croaks. Sometimes it throws some malloc error and the last time it just segfaulted. Essentially, the indexer process continues to use more and more RAM until it dies. It really only makes it up to about 3G of RAM before dying and it never hits swap.
That's odd: I'd expect it to be able to get up past the amount of
physical memory before being killed off. Have you been able to determine
why it dies? i.e., is there an OOM killer running, or is it due to an
internal error?
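One way to narrow this down is to log the process's peak memory as the indexer runs, so you can see how far it gets before dying. The following is only a rough sketch using the standard library: `peak_rss_mb` and `index_with_memory_log` are names made up for illustration, and `add_document` stands in for whatever your indexer actually calls for each document.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024.0 * 1024.0)
    return peak / 1024.0


def index_with_memory_log(docs, add_document, every=1000):
    """Call add_document(doc) for each doc, logging peak memory periodically.

    add_document is a placeholder (an assumption, not part of any library):
    pass in whatever callable your indexer uses to add one document.
    """
    count = 0
    for doc in docs:
        add_document(doc)
        count += 1
        if count % every == 0:
            print("%d docs indexed, peak RSS %.1f MB" % (count, peak_rss_mb()))
    return count
```

If the last figure logged before the crash is well below physical memory, that points towards an internal error or a per-process limit rather than the machine running out of RAM.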
I've indexed some fairly large datasets with Xapian 1.0.1 using the
Python bindings (around 20GB databases), with no problems like this.
Which version of Python are you using? I wonder if the problem could be
Python, rather than Xapian: it's fairly easy to fail to delete objects
in python, and if there was a memory leak there, that could be the cause
of the problem. Or maybe something in Python is
If you want to send your indexing script, I'll take a brief look at it
and see if there's anything obvious wrong (probably send it direct to
me, since the list won't accept attachments). I've also got a copy of
valgrind set up to run python programs, so if you send me a couple of
documents of sample data, I can try that out.
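In the meantime, a quick check for objects piling up on the Python side is to compare counts of live objects by type at two points in the indexing loop. This is a minimal sketch using only the standard `gc` module; the helper names `live_object_counts` and `growth` are invented for illustration. It only sees objects tracked by Python's garbage collector, so memory held purely inside a C++ extension would not show up here.

```python
import gc
from collections import Counter


def live_object_counts():
    """Count gc-tracked live objects, grouped by type name."""
    gc.collect()  # drop unreachable cycles first so they are not counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, top=10):
    """Return the types whose live-object count grew the most."""
    diff = after - before  # Counter subtraction keeps only positive counts
    return diff.most_common(top)
```

Take one snapshot after, say, 1,000 documents and another after 2,000, then print `growth(first, second)`. A type whose count climbs roughly in step with the number of documents indexed is a likely candidate for whatever is being kept alive.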
> So, you can see that the number of docs, disk space, doc length, etc are basically the same.
Well, actually the average document length is quite significantly
smaller in the first set of log entries; something odd is definitely
going on there.
> My next step was to recompile Xapian and the Python bindings from source (1.0.2 is out now) and see if that helps. Any other thoughts or suggestions are greatly appreciated!
There are updated packages available in the xapian.org/debian
repository, too, if you want to try those.
--
Richard