[Xapian-discuss] Search performance issues and profiling/debugging search?

Richard Boulton richard at lemurconsulting.com
Tue Oct 23 22:25:47 BST 2007


Ron Kass wrote:
> * Estimates vary, although it's exactly the same search, run one right 
> after the other with no changes to the DB (no data added). This is not 
> really a big issue.

This is the issue which looks oddest to me, however (though there are 
other oddities).  If I understood you correctly, each search is 
performed on the same Database object, without reopening the Database 
object between each search.  This should result in exactly the same 
results (and estimates) for each repeated search, since a Database 
object accesses a snapshot of the underlying database.  Since this isn't 
happening, there must be something we don't understand about your setup, 
and this is the first thing to resolve.
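
A quick way to confirm this is to run the same query several times against 
one Enquire object and compare the estimates.  The following is only a 
sketch, assuming the Python bindings and an Enquire object which already 
has its query set; the helper names are made up for illustration:

```python
def repeated_estimates(enquire, runs=5, first=0, maxitems=10):
    """Run the same search `runs` times and collect each estimate.

    `enquire` is assumed to be a xapian.Enquire with its query already
    set, built on a Database object which is NOT reopened in between.
    """
    return [
        enquire.get_mset(first, maxitems).get_matches_estimated()
        for _ in range(runs)
    ]


def is_stable(estimates):
    """True if every run reported the same estimated match count."""
    return len(set(estimates)) <= 1
```

If is_stable() returns False for a Database which hasn't been reopened, 
that is the behaviour which needs explaining.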

Are you using the network (ie, "remote") backend?  If so, the problem 
could be that some of the database connections are timing out occasionally.

Is the database mounted locally, or over a network filing system?

If the database isn't being modified while searches are in progress, 
and/or Database objects aren't being reopened between searches, 
something very weird is happening.  Try doing the search on just a 
single database, and see if you get the same effects.  If not, add in 
databases gradually until you do.
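
As a rough sketch of that "add databases gradually" approach (Python 
bindings assumed; both function names, the query string handling and the 
run count are placeholders, and your real query setup may well differ):

```python
def first_unstable_prefix(paths, estimates_for):
    """Return the shortest prefix of `paths` whose repeated-search
    estimates disagree, or None if every prefix is stable.

    `estimates_for` is a caller-supplied function taking a list of
    database paths and returning a list of estimates from repeated
    runs of the same search.
    """
    for n in range(1, len(paths) + 1):
        subset = paths[:n]
        if len(set(estimates_for(subset))) > 1:
            return subset
    return None


def xapian_estimates(paths, query_string, runs=5):
    """One possible `estimates_for`, searching the given databases together."""
    import xapian  # imported here so the helper above has no dependency

    db = xapian.Database()
    for path in paths:
        db.add_database(xapian.Database(path))
    parser = xapian.QueryParser()
    parser.set_database(db)
    enquire = xapian.Enquire(db)
    enquire.set_query(parser.parse_query(query_string))
    return [
        enquire.get_mset(0, 10).get_matches_estimated() for _ in range(runs)
    ]
```

Something like first_unstable_prefix(paths, lambda p: xapian_estimates(p, 
"your query")) would then report the point at which the estimates start 
to vary, if they do.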


Other than that, it would be useful to separate timings for the search 
(ie, the get_mset() call) from the calls which fetch the document data, 
to see where most of the time is being spent.
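
For example, something along these lines would split the two timings.  
This is just a sketch assuming the Python bindings; time_search is a 
made-up name, and `enquire` is assumed to be an Enquire object with its 
query already set:

```python
import time


def time_search(enquire, first=0, maxitems=10):
    """Time the match (get_mset) separately from fetching document data."""
    t0 = time.perf_counter()
    mset = enquire.get_mset(first, maxitems)
    t1 = time.perf_counter()
    # Fetching the document data is a separate step from the match itself.
    data = [match.document.get_data() for match in mset]
    t2 = time.perf_counter()
    return {
        "get_mset_s": t1 - t0,
        "fetch_s": t2 - t1,
        "docs": len(data),
    }
```

Comparing get_mset_s with fetch_s over a few representative queries 
should show which side is worth investigating first.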


> 4. Does anyone have ideas on how we can profile things to see why we
> have such performance issues? We would be happy to dig deeper to find
> out where the problems arise from.

oprofile is your best bet for seeing where the time is being spent.

-- 
Richard


