[Xapian-discuss] Moving to 1.0.x

Olly Betts olly at survex.com
Mon Oct 8 22:43:22 BST 2007


On Mon, Oct 08, 2007 at 04:29:52PM -0400, Mike Boone wrote:
> I had to look it up. The old machine is on 0.8.5 using Red Hat
> Enterprise 2.1 and the infamous GCC 2.96.

OK, that was the last 0.8.x release which helps narrow things down a
bit.

> The new box is running Red Hat Enterprise 4, for which the default
> install of PHP is a Red-Hat-patched version 4.3.9. I'm hoping to leave
> it alone so I can let Red Hat worry about the security updates. If I
> find something that would really work better under PHP5, I might just
> switch.

You may find they stop supporting it fairly soon -
http://www.php.net/archive/2007.php says:

    The PHP development team hereby announces that support for PHP 4
    will continue until the end of this year only. After 2007-12-31
    there will be no more releases of PHP 4.4. We will continue to make
    critical security fixes available on a case-by-case basis until
    2008-08-08. Please use the rest of this year to make your
    application suitable to run on PHP 5.

PHP doesn't have the best security track record, so I'd be surprised if
Linux distros took on the burden of security support after the PHP team
give up.

> So it looks like flushing isn't doing anything to give back memory, as
> my indexer would be running flush around 23 times.

Note that at present, flushing doesn't release memory to the OS - it
only gets returned to the C++ memory allocation system.  So you won't
see memory usage drop, but it shouldn't climb without limit.
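
As a rough sketch of how one might tell "levels off" from "climbs without
limit": sample the process's resident set size at each flush and compare the
later samples against the earlier peak. This is Python rather than PHP, it
assumes Linux (it reads the VmRSS line of /proc/self/status), and the names
parse_vmrss_kb, current_rss_kb and looks_bounded and the 5% tolerance are
made up for illustration; none of them is part of Xapian.

```python
def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:      1234 kB".
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def current_rss_kb(status_path="/proc/self/status"):
    """Return this process's resident set size in kB (Linux only)."""
    with open(status_path) as f:
        return parse_vmrss_kb(f.read())


def looks_bounded(samples, tolerance=0.05):
    """Return True if the second half of the samples never exceeds the
    first-half peak by more than `tolerance` (a fraction, 0.05 = 5%).

    This is a heuristic (hypothetical threshold), not a proof that
    there is no leak.
    """
    if len(samples) < 4:
        raise ValueError("need at least 4 samples")
    half = len(samples) // 2
    early_peak = max(samples[:half])
    late_peak = max(samples[half:])
    return late_peak <= early_peak * (1 + tolerance)
```

Memory that is freed back to the allocator but not to the OS would show up
here as samples that rise and then stay roughly flat, whereas a real leak
would keep pushing the later samples above the earlier peak.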

> The memory usage always is up, at least by my output which shows
> memory usage for every 1,000 documents.

How are you measuring memory usage BTW?

> I leave the WritableDatabase object open the whole
> time; I have not tried to close it and reopen it after X documents, or
> limit the process to X documents and then restart it for the next set
> of X.

Hmm, perhaps there's a memory leak in the PHP wrappers.  If you remove
the calls to add_document() and/or replace_document(), do you also see
increasing memory usage?
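
The comparison could be structured roughly like this: run the same loop
twice, once with the real per-document step and once with the
add_document()/replace_document() call stubbed out, and compare the growth
per sample. This is a Python sketch, not the PHP indexer itself;
memory_growth, growth_per_sample, step and read_memory are hypothetical
names, and read_memory is assumed to be any zero-argument callable returning
a number (for example a reader of the process's resident set size).

```python
def memory_growth(step, n_docs, read_memory, sample_every=1000):
    """Call step(i) for i = 1..n_docs and return memory readings taken
    before the loop and after every `sample_every` documents."""
    samples = [read_memory()]
    for i in range(1, n_docs + 1):
        step(i)
        if i % sample_every == 0:
            samples.append(read_memory())
    return samples


def growth_per_sample(samples):
    """Average change between consecutive readings."""
    if len(samples) < 2:
        raise ValueError("need at least 2 samples")
    return (samples[-1] - samples[0]) / (len(samples) - 1)
```

If the growth per sample is about the same with the calls stubbed out, the
increase is coming from somewhere other than the indexing calls (document
construction, the MySQL fetch, and so on); if it drops to roughly zero, the
calls or their wrappers are the place to look.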

> Are there any benchmarking tools that would give me an idea of the
> speed of just Xapian on a system?

Richard is working on something like this, but it's not quite ready yet.

> My standard search runs via PHP and
> once Xapian returns results, the content of the matching documents is
> pulled from MySQL. So there are a lot of variables involved in my
> estimate of the search speed. It might be MySQL or PHP or Apache that
> need tuning, so it would be nice to eliminate Xapian from the
> troubleshooting.

Indeed, though you should be able to get close by trying everything on
the same machine with the same versions of everything else.

Cheers,
    Olly


