[Xapian-discuss] Improving indexing speed
James Aylett
james-xapian at tartarus.org
Wed Jul 2 01:49:28 BST 2008
On Tue, Jul 01, 2008 at 02:47:12PM -0700, Robert Kaye wrote:
> > Yes, it's always interesting to hear performance reports.
>
> Ok, I've tinkered with the setup a bit. I've found that if I give
> xapian loads and loads of RAM, it doesn't even get around to using all
> the RAM I give it -- at most each process used 5% of 8G of RAM.
>
> I measured disk access with:
>
> iostat -x 10 (10 second disk usage average window)
>
> And CPU util with top. I've found:
>
> 3 processes: 95% - 96% CPU usage for each process, 40%-60% disk usage
> 4 processes: 95% - 96% CPU usage for each process, 60%-90% disk usage
> 5 processes: 92% - 94% CPU usage for each process, 80%-100% disk usage
> 6 processes: 91% - 93% CPU usage for each process, 100% disk usage sustained
>
> It looks like 4 processes is the sweet spot that doesn't utterly slam
> the machine. This is much better than I had anticipated -- well done
> Xapian team!
Couple of detail questions:
* what processor?
* what OS?
* how many spindles behind the FS volume?
* what hard disks?
All hard data is good data, but obviously it's even better if there's
context as well -- apologies if you've already given any of these
details, but I didn't notice them recently in the thread.
(By the way, slamming the disks during indexing is what you want to do
unless you're also searching off the same database. A breakdown of the
type of CPU usage will help analysis here -- iowait versus sys/user
will tell you when you're starting to become IO bound. 4-6 processes
to max out your storage is pretty good :-)
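
One rough way to get that breakdown, assuming a Linux box (the OS has not been stated in this thread, so that is a guess), is to sample the aggregate "cpu" line of /proc/stat twice and compare the deltas. The helper names below (parse_cpu_line, cpu_breakdown, sample) are made up for illustration and are not part of Xapian or any other tool; this is a sketch, not a replacement for top or iostat.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat on Linux.
# Older kernels expose fewer columns; zip() below simply stops early.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu  u n s i w ...' line into a dict of jiffy counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample(path="/proc/stat"):
    """Read the current aggregate CPU counters (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


if __name__ == "__main__":
    first = sample()
    time.sleep(10)  # same 10 second window as the iostat run quoted above
    second = sample()
    for name, pct in cpu_breakdown(first, second).items():
        print(f"{name:8s} {pct:5.1f}%")
```

Run it alongside each indexing configuration (3, 4, 5, 6 processes). If the iowait share climbs as processes are added while user and system stay flat or fall, the run is becoming IO bound.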
J
--
James Aylett xapian.org
james at tartarus.org uncertaintydivision.org