[Xapian-discuss] PHP indexing, what's the PHP method for indexscript
James Aylett
james-xapian at tartarus.org
Wed Jan 16 14:57:48 GMT 2008
On Wed, Jan 16, 2008 at 01:17:51AM -0800, athlon athlonf wrote:
> I've managed to correctly use PHP-bindings to index my database and
> I'm really amazed by the speed.
That's excellent news - congratulations.
> Apparently, the method of using a Perl script like dbi2omega to get
> the input file and then use scriptindex to parse and index it is much
> slower. Indexing with PHP took 9 hours to complete on my
> development machine (amd64 3800 with 2GB of RAM and 5HDD-raid5) for 3
> million documents, with less load.
A two-part indexing process, such as the one scriptindex uses, is often
going to be slower, since the data is first written to an intermediate
file and then parsed again. There may also be differences in memory
consumption and other things between the different bindings and languages.
> Indexing with dbi2omega->scriptindex takes more than 24 hours and
> it's not even at 40% (I've made several intermediate files) at load
> 5. And this on an AMD dual opteron 246 with raid1 and 3GB of RAM.
Load 5 suggests something's wrong, because dbi2omega and scriptindex
are both linear processes. Are you running several instances in
parallel in some way?
I believe that right now, none of the supplied Xapian indexing scripts
or binaries will go significantly above a load of 1, unless you have
other issues or something else happening on the machine.
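As a rough sketch of that sanity check (Python and a Unix-like machine
are assumptions here, not something from the original report; the
function name and the 0.5 slack are purely illustrative), you can compare
the 1-minute load average with the number of single-threaded indexers
you think are running:

```python
import os


def load_is_suspicious(load1, single_threaded_jobs=1, slack=0.5):
    """Return True if the 1-minute load average is higher than the given
    number of single-threaded jobs could plausibly account for."""
    return load1 > single_threaded_jobs + slack


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    load1, _load5, _load15 = os.getloadavg()
    if load_is_suspicious(load1):
        print(f"load {load1:.2f}: more than one indexer would explain; "
              "look for parallel instances or other work on the machine")
    else:
        print(f"load {load1:.2f}: consistent with a single indexing process")
```

Bear in mind that on Linux the load average also counts tasks blocked on
disk I/O, so a high figure can point at a struggling disk as well as at
extra processes.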
J
--
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org