[Xapian-discuss] Ideas for faster DB searching

Olly Betts olly at survex.com
Fri Dec 2 00:45:56 GMT 2005


On Thu, Dec 01, 2005 at 04:29:42PM -0600, Tony Lambiris wrote:
> I finally came across that quest utility, and I have to say the 
> speed-ups are incredible... why would omega have that much overhead?

Did you try disabling "topterms"?  Do so by removing the $topterms bits
of the default query template.

The topterms can take a while to compute - it seems to be especially
a problem with multiple databases from what I've seen, but I've not
investigated why.

> We at work recently built a RAID level 0 array in an Apple Xraid, have it
> mounted under Linux 2.6 with noatime options...

I doubt noatime makes much difference, as there aren't a lot of files
involved (unlike INN or similar), but it can't hurt.  If you have
figures showing it makes a measurable difference, let me know and I'll
add it as general advice.
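
If you want to gather such figures, a rough timing harness along these
lines might do.  It's only a sketch: time_command and summarise are names
made up for illustration, and the placeholder command needs replacing
with your real search invocation.  Run it with and without noatime and
compare the medians.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return the wall-clock seconds for each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal speed doesn't skew the numbers.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return timings


def summarise(timings):
    """Return (min, median, max) for a list of timings."""
    return min(timings), statistics.median(timings), max(timings)


if __name__ == "__main__":
    # Placeholder command: substitute your real search invocation, e.g.
    # a quest command line (check `quest --help` for the exact options).
    cmd = [sys.executable, "-c", "pass"]
    lo, mid, hi = summarise(time_command(cmd))
    print("min %.4fs  median %.4fs  max %.4fs" % (lo, mid, hi))
```

The first run after a remount will mostly reflect a cold disk cache, so
it's worth looking at the spread rather than a single number.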

> are there any other general recommendations for speeding up Xapian DB
> searching? Is there anything I can store in memory for faster lookups?

We leave disk caching to the OS, since it has a much better overview of
the VM system.  This also simplifies the library code.  This approach
seems to work very well on modern OSes.  Linux in particular is
generally very eager to cache disk blocks - I've occasionally found it
will even swap out code you didn't want swapped out.

It's possible you can tune your OS's VM system for this - I know Linux
2.6 has some knobs, but I've not tried using them myself.
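
As a starting point, here is a small sketch which just reports the
current values of two such knobs.  The choice of knobs is an assumption
on my part: vm.swappiness and vm.vfs_cache_pressure are the ones which
look most relevant, but whether changing either helps Xapian searching is
untested.  read_vm_knobs is a made-up helper name.

```python
import os

# Assumed knobs: sysctl files exposed under /proc/sys/vm by Linux 2.6.
VM_KNOBS = ("swappiness", "vfs_cache_pressure")


def read_vm_knobs(names=VM_KNOBS, base="/proc/sys/vm"):
    """Return {knob: int value, or None if the file can't be read}."""
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Not Linux, knob missing, or unexpected contents.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_knobs().items():
        print("vm.%s = %s" % (name, value))
```

A lower swappiness makes the kernel less inclined to swap out process
memory in favour of disk cache.  Change one knob at a time and measure
before and after.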

> Also we have a bunch of separate DBs (one per day, each one ~100k
> documents)... is it recommended to just have one big DB?

It's probably going to be slightly quicker to search just one DB,
and one DB is likely to be smaller than the combined size of several,
so there's less DB to cache.

But if you prefer to keep multiple DBs, run them through quartzcompact
(or xapian-compact for Flint DBs) once the day is over and you'll get a
smaller DB which searches faster.

If you want just one DB, you can use quartzcompact/xapian-compact to
merge and compact several DBs into one.
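
If you end up scripting this, a minimal sketch of building the command
line might look like the following.  compact_command and the directory
names are hypothetical, and it assumes both tools take the source
databases first and the destination last - check the tool's --help
output for your version before relying on it.

```python
import os


def compact_command(sources, dest, tool="xapian-compact"):
    """Build the argv for compacting/merging `sources` into `dest`.

    Assumes the tool's arguments are SOURCE... DESTINATION.
    """
    if not sources:
        raise ValueError("need at least one source database")
    if os.path.abspath(dest) in [os.path.abspath(s) for s in sources]:
        raise ValueError("destination must differ from every source")
    return [tool] + list(sources) + [dest]


if __name__ == "__main__":
    # Hypothetical per-day database directories.
    daily = ["db/2005-11-29", "db/2005-11-30", "db/2005-12-01"]
    print(" ".join(compact_command(daily, "db/merged")))
```

With a single source this compacts one day's DB; with several it merges
them.  Pass tool="quartzcompact" for Quartz DBs, and hand the resulting
list to subprocess.run(..., check=True) to actually run it.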

Cheers,
    Olly


