[Xapian-discuss] xapian performance
Olly Betts
olly at survex.com
Wed Nov 15 18:00:26 GMT 2006
On Tue, Nov 14, 2006 at 08:33:39AM +0100, Arjen van der Meijden wrote:
> On 14-11-2006 3:34 Olly Betts wrote:
> >There's also still plenty of scope for improving code to speed up
> >indexing, and some scope for faster searching too. I know some places
> >where better algorithms can be used, but I suspect there are bottlenecks
> >in unobvious places too. Profiling to identify these places would be
> >a useful activity.
>
> How hard would that be? I.e. what tools would we need to supply good
> profiling samples?
The tricky part is that it's the interaction with I/O which usually
matters. So any profiling technique which is very invasive could
be a problem as the effects of I/O will tend to be underestimated.
I think the best tool for this (at least for Linux) is probably
oprofile, since it's about as non-invasive as it gets.
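To illustrate why I/O time is easy to underestimate, here is a minimal Python sketch (not Xapian code) comparing wall-clock time with CPU time for a piece of I/O-bound work. The function names, the 1 MiB scratch file and the 0.2 second sleep standing in for a slow seek are all assumptions made for illustration.

```python
import os
import tempfile
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) for one call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


def io_bound_work(n_bytes=1 << 20):
    """Write a scratch file, force it to disk, then wait briefly.

    The sleep is a hypothetical stand-in for a slow seek; it is not a
    model of how Xapian actually does I/O.
    """
    with tempfile.NamedTemporaryFile() as scratch:
        scratch.write(b"\0" * n_bytes)
        scratch.flush()
        os.fsync(scratch.fileno())
        time.sleep(0.2)


wall, cpu = measure(io_bound_work)
# Time spent blocked shows up in wall-clock time but not in CPU time.
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s  waiting: {wall - cpu:.3f}s")
```

A measurement that only counts CPU time, or that adds CPU overhead of its own, would make this function look far cheaper than it is in practice.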
We did try this before, but I couldn't get the tarballed oprofile
data you sent me to process. I think it's simpler if the profile is
processed before it's sent.
So I think the best way to do this would be to take a particularly slow
real-world example, and run it with oprofile. Then process the results to
produce a call graph (with opreport --callgraph) which should show where we
wait on I/O as part of the cumulative time spent in a function.
Hopefully that should provide sufficient insight to see where the time
is going.
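The steps above could be scripted roughly as follows. This is only a sketch, assuming the opcontrol-based oprofile command set, root access, and a kernel with call-graph support. The workload command `./slow-query.sh`, the call-graph depth of 16 and the report filename are hypothetical placeholders. By default it only prints the commands rather than running them.

```python
import shlex
import subprocess


def oprofile_commands(workload, depth=16, report_path="callgraph.txt"):
    """Return the shell commands for one profiling session, in order.

    workload is the command that reproduces the slow case; depth and
    report_path are illustrative defaults, not recommended values.
    """
    return [
        "opcontrol --init",                 # load the oprofile kernel support
        "opcontrol --no-vmlinux",           # skip kernel symbol resolution
        f"opcontrol --callgraph={depth}",   # record call chains to this depth
        "opcontrol --reset",                # discard samples from earlier runs
        "opcontrol --start",                # begin system-wide sampling
        workload,                           # the slow real-world example
        "opcontrol --dump",                 # flush collected samples
        "opcontrol --shutdown",             # stop sampling and the daemon
        f"opreport --callgraph > {shlex.quote(report_path)}",
    ]


def run(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(command)
        if not dry_run:
            subprocess.run(command, shell=True, check=True)


# Dry run: shows what would be executed without touching the system.
run(oprofile_commands("./slow-query.sh"))
```

Writing the opreport output to a text file means the processed report can be sent around, rather than the raw sample data.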
> I think we could compile a profiling-enabled
> xapian-library/omega-binary on our production machine to generate some
> code profiles. Or we could copy the database to a less beefy machine, but
> that would probably skew the results towards being more I/O-bound than
> they really are in our environment.
An issue with oprofile is that it profiles the whole system, so running on
a production machine isn't great. A skew towards being more I/O-bound is
probably not a problem - I think that's where the slow cases tend to be.
Cheers,
Olly