[Xapian-discuss] xapian performance

Olly Betts olly at survex.com
Wed Nov 15 18:00:26 GMT 2006


On Tue, Nov 14, 2006 at 08:33:39AM +0100, Arjen van der Meijden wrote:
> On 14-11-2006 3:34 Olly Betts wrote:
> >There's also still plenty of scope for improving code to speed up
> >indexing, and some scope for faster searching too.  I know some places
> >where better algorithms can be used, but I suspect there are bottlenecks
> >in unobvious places too.  Profiling to identify these places would be
> >a useful activity.
> 
> How hard would that be? I.e. what tools would we need to supply good 
> profiling samples?

The tricky part is that it's the interaction with I/O which usually
matters.  So any profiling technique which is very invasive could
be a problem as the effects of I/O will tend to be underestimated.
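
As a rough illustration of why I/O gets underestimated (a Python sketch
only, nothing from Xapian or oprofile): a measurement which only counts
CPU time misses the time a process spends blocked.  Here time.sleep()
stands in for waiting on a disk read, and fake_io_bound_task and measure
are made-up names for this example.

```python
import time


def fake_io_bound_task(wait_seconds=0.2):
    """Hypothetical stand-in for a search which blocks on disk reads."""
    # A small amount of real CPU work...
    total = sum(i * i for i in range(10000))
    # ...then a blocking wait.  time.sleep() is an assumption standing in
    # for real I/O; it uses wall-clock time but almost no CPU time.
    time.sleep(wait_seconds)
    return total


def measure(func):
    """Return (wall_seconds, cpu_seconds) for a single call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return wall_seconds, cpu_seconds


if __name__ == "__main__":
    wall, cpu = measure(fake_io_bound_task)
    print("wall-clock: %.3fs  CPU only: %.3fs" % (wall, cpu))
```

The gap between the two figures is the blocked time, which is the part a
CPU-only view of the program would not see.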

I think the best tool for this (at least for Linux) is probably
oprofile, since it's about as non-invasive as it gets.

We did try this before, but I couldn't manage to process the tarball of
oprofile data you sent me.  I think it's simpler if the profile is
processed before being sent.

So I think the best way to do this would be to take a particularly slow
real world example, and run it with oprofile.  Then process the results
to produce a callgraph (with opreport --callgraph) which should show
where we wait on I/O as part of the cumulative time spent in a function.
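
To make "cumulative time" concrete, here is a toy sketch in Python.  It is
not opreport's output format, and the function names and timings in
call_tree are invented for illustration.  The idea is just that a
function's cumulative time is its own time plus that of everything it
calls, so a wait buried deep in the call chain still shows up in its
callers.

```python
def cumulative_time(node):
    """Self time of a call-tree node plus the cumulative time of its callees."""
    return node["self"] + sum(
        cumulative_time(callee) for callee in node.get("callees", [])
    )


def report(node, depth=0):
    """Return a list of indented 'name self cumulative' lines for the tree."""
    lines = ["%s%s self=%.2f cumulative=%.2f"
             % ("  " * depth, node["name"], node["self"], cumulative_time(node))]
    for callee in node.get("callees", []):
        lines.extend(report(callee, depth + 1))
    return lines


# Invented numbers (seconds) and hypothetical function names.
call_tree = {
    "name": "run_query", "self": 0.5,
    "callees": [
        {"name": "score_documents", "self": 2.0},
        {"name": "read_block", "self": 0.25,
         "callees": [{"name": "wait_for_disk", "self": 8.0}]},
    ],
}

if __name__ == "__main__":
    print("\n".join(report(call_tree)))
```

In this made-up tree read_block has very little self time, but its
cumulative time is dominated by the wait underneath it, which is the
sort of pattern to look for in a real callgraph.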

Hopefully that should provide sufficient insight to see where the time
is going.

> I think we could compile a profiling-enabled 
> xapian-library/omega-binary on our production-machine to generate some 
> code profiles. Or we could copy the database to a less beefy machine, but 
> that would probably skew the results towards more I/O-based than it (in 
> our environment) really is.

An issue with oprofile is that it profiles the whole system, so running
it on a production machine isn't great.  A skew towards being more I/O
bound is probably not a problem - I think that's where the slow cases
tend to be.

Cheers,
    Olly


