[Xapian-discuss] PHP indexing, what's the PHP method for indexscript

James Aylett james-xapian at tartarus.org
Wed Jan 16 14:57:48 GMT 2008


On Wed, Jan 16, 2008 at 01:17:51AM -0800, athlon athlonf wrote:

> I've managed to correctly use PHP-bindings to index my database and
> I'm really amazed by the speed.

That's excellent news - congratulations.

> Apparently, the method of using a Perl script like dbi2omega to get
> the input file and then using scriptindex to parse and index it is much
> slower.  Indexing with PHP took 9 hours to complete on my
> development machine (amd64 3800 with 2GB of RAM and 5-HDD RAID 5) for 3
> million documents, with less load.

A two-part indexing process, such as the one scriptindex uses, will
often be slower, since the data is dumped to an intermediate file and
then parsed again before it is indexed. There may also be differences
in memory consumption and other factors between the different bindings
and languages.

> Indexing with dbi2omega->scriptindex has taken more than 24 hours and
> it's not even at 40% (I've made several intermediate files), at load
> 5. And this is on an AMD dual Opteron 246 with RAID 1 and 3GB of RAM.

A load of 5 suggests something's wrong, because dbi2omega and
scriptindex are both linear (single-threaded) processes. Are you
running several instances in parallel in some way?

I believe that right now, none of the supplied Xapian indexing scripts
or binaries will go significantly above a load of 1, unless you have
other issues or something else happening on the machine.
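
As a rough way to check this, here is a minimal Python sketch that
compares the 1-minute load average with the number of indexer processes
currently running. It is not part of Xapian: the helper names
count_processes and load_report are made up for illustration, and it
assumes a Linux machine with /proc mounted. It also assumes dbi2omega
is started directly as a script; if it is run as "perl dbi2omega", the
process will probably show up under the name "perl" instead.

```python
import os
from pathlib import Path


def count_processes(names):
    """Count running processes whose command name is in `names`.

    Hypothetical helper: reads /proc/<pid>/comm, so it only works on Linux.
    On systems without /proc it simply returns 0.
    """
    wanted = set(names)
    count = 0
    for comm in Path("/proc").glob("[0-9]*/comm"):
        try:
            if comm.read_text().strip() in wanted:
                count += 1
        except OSError:
            # The process exited (or is unreadable) between listing and reading.
            continue
    return count


def load_report(names=("scriptindex", "dbi2omega")):
    """Return the 1-minute load average and the number of matching processes."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    return {"load_1min": one_min, "indexers": count_processes(names)}


if __name__ == "__main__":
    report = load_report()
    print("1-minute load average:", report["load_1min"])
    print("indexer processes running:", report["indexers"])
```

If the load is well above the number of indexer processes reported, the
extra load is probably coming from something else on the machine, such
as processes waiting on disk I/O.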

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


