[Xapian-discuss] FLAG_WILDCARD, add_database and performance

Oliver Flimm flimm at ub.uni-koeln.de
Mon Aug 4 08:45:47 BST 2008


Hi,

On Mon, Aug 04, 2008 at 08:57:30AM +0200, Oliver Flimm wrote:
> > Could you profile to find where the time is spent?  Some tips are here:
> > 
> > http://trac.xapian.org/wiki/ProfilingXapian

it looks like some routines in libc get called a lot when using a
wildcard search.

Here are the first lines of output for the search request 'java' which
took around 3.5 seconds:

CPU: Core 2, speed 2666.68 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        image name               app name
symbol name
-------------------------------------------------------------------------------
62107    75.5606  libperl.so.5.8.8         libperl.so.5.8.8
(no symbols)
  62107    100.000  libperl.so.5.8.8         libperl.so.5.8.8
(no symbols) [self]
-------------------------------------------------------------------------------
11012    13.3974  no-vmlinux               no-vmlinux
(no symbols)
  11012    100.000  no-vmlinux               no-vmlinux
(no symbols) [self]
-------------------------------------------------------------------------------
5883      7.1574  libc-2.3.6.so            libc-2.3.6.so
(no symbols)
  5883     100.000  libc-2.3.6.so            libc-2.3.6.so
(no symbols) [self]
-------------------------------------------------------------------------------
658       0.8005  oprofiled                oprofiled
(no symbols)
  658      100.000  oprofiled                oprofiled
(no symbols) [self]
-------------------------------------------------------------------------------
528       0.6424  libpthread-2.3.6.so      libpthread-2.3.6.so
pthread_getspecific
  528      100.000  libpthread-2.3.6.so      libpthread-2.3.6.so
pthread_getspecific [self]
-------------------------------------------------------------------------------
489       0.5949  Util.so                  Util.so
(no symbols)
  489      100.000  Util.so                  Util.so
(no symbols) [self]
-------------------------------------------------------------------------------
310       0.3772  libxapian.so.15.5.1      libxapian.so.15.5.1
(no symbols)
  310      100.000  libxapian.so.15.5.1      libxapian.so.15.5.1
(no symbols) [self]
-------------------------------------------------------------------------------
202       0.2458  mysqld                   mysqld
(no symbols)
  202      100.000  mysqld                   mysqld
(no symbols) [self]
-------------------------------------------------------------------------------
188       0.2287  libstdc++.so.6.0.8       libstdc++.so.6.0.8
(no symbols)
  188      100.000  libstdc++.so.6.0.8       libstdc++.so.6.0.8
(no symbols) [self]
-------------------------------------------------------------------------------

And here are the first few lines for the search request 'java
program*' that took 206 seconds:

CPU: Core 2, speed 2666.68 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        image name               app name
symbol name
-------------------------------------------------------------------------------
1710086  81.0939  libc-2.3.6.so            libc-2.3.6.so
(no symbols)
  1710086  100.000  libc-2.3.6.so            libc-2.3.6.so
(no symbols) [self]
-------------------------------------------------------------------------------
180897    8.5783  no-vmlinux               no-vmlinux
(no symbols)
  180897   100.000  no-vmlinux               no-vmlinux
(no symbols) [self]
-------------------------------------------------------------------------------
131454    6.2337  libxapian.so.15.5.1      libxapian.so.15.5.1
(no symbols)
  131454   100.000  libxapian.so.15.5.1      libxapian.so.15.5.1
(no symbols) [self]
-------------------------------------------------------------------------------
70516     3.3439  libperl.so.5.8.8         libperl.so.5.8.8
(no symbols)
  70516    100.000  libperl.so.5.8.8         libperl.so.5.8.8
(no symbols) [self]
-------------------------------------------------------------------------------
5854      0.2776  libstdc++.so.6.0.8       libstdc++.so.6.0.8
(no symbols)
  5854     100.000  libstdc++.so.6.0.8       libstdc++.so.6.0.8
(no symbols) [self]
-------------------------------------------------------------------------------
4304      0.2041  oprofiled                oprofiled
(no symbols)
  4304     100.000  oprofiled                oprofiled
(no symbols) [self]
-------------------------------------------------------------------------------
950       0.0450  mysqld                   mysqld
(no symbols)
  950      100.000  mysqld                   mysqld
(no symbols) [self]
-------------------------------------------------------------------------------

Both search requests use the combined database.
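
As a rough cross-check of the two profiles, the sample counts per image can
be compared directly. The sketch below is only illustrative: the numbers are
copied from the output above, the short image names and the growth() helper
are made up for this example, and since the samples appear to be system-wide
and the two runs differ in length, the ratios are only indicative.

```python
# Sample counts copied from the two profiles above (image name -> samples).
simple = {"libc": 5883, "libxapian": 310, "libperl": 62107, "no-vmlinux": 11012}
wildcard = {"libc": 1710086, "libxapian": 131454, "libperl": 70516, "no-vmlinux": 180897}


def growth(before, after):
    """Return, per image, how many times more samples 'after' has than 'before'."""
    return {image: after[image] / before[image] for image in before}


factors = growth(simple, wildcard)

# Print the images with the largest increase first.
for image, factor in sorted(factors.items(), key=lambda item: -item[1]):
    print(f"{image:12s} x{factor:.1f}")
```

libperl barely changes between the two runs, while libc and libxapian grow
by a factor of several hundred, which fits the impression that the extra
time is spent below the Perl layer.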

Here are some request times compared to the number of selected databases:

141 databases - 206 seconds
104 databases - 86 seconds
69 databases - 16 seconds
41 databases - 4 seconds
8 databases - 0.29 seconds
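
To make the scaling easier to see, the times above can be divided by the
number of databases. This is just arithmetic on the figures listed; the
seconds_per_database() helper is a name invented for this sketch.

```python
# Timings copied from the list above: (number of databases, seconds).
timings = [(141, 206.0), (104, 86.0), (69, 16.0), (41, 4.0), (8, 0.29)]


def seconds_per_database(rows):
    """Return (databases, seconds per database) for each measurement."""
    return [(n, secs / n) for n, secs in rows]


per_db = seconds_per_database(timings)

for n, cost in per_db:
    print(f"{n:4d} databases: {cost:.3f} s per database")
```

If the cost grew linearly with the number of databases, the per-database
time would stay roughly constant. Here it rises from about 0.036 s with 8
databases to about 1.46 s with 141, so the growth is clearly worse than
linear.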

Regards,

Oliver

-- 
Universitaet zu Koeln :: Universitaets- und Stadtbibliothek
IT-Dienste :: Abteilung Universitaetsgesamtkatalog
Universitaetsstr. 33 :: D-50931 Koeln
Tel.: +49 221 470-3330 :: Fax: +49 221 470-5166
flimm at ub.uni-koeln.de :: www.ub.uni-koeln.de


