[Xapian-discuss] Update under a large database is slow

Olly Betts olly at survex.com
Wed Sep 5 03:08:32 BST 2007


On Thu, Jul 12, 2007 at 10:18:07AM +0800, Gea-Suan Lin wrote:
> We use the Perl module Search::Xapian 1.0.2.0 to index ~4m articles (it's
> 26GB right now), but updating is slow (about 4 articles/sec, and it's
> I/O bound).

What spec is the machine?

Are you setting XAPIAN_FLUSH_THRESHOLD?
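
In case it helps, here is a rough sketch of one way to set it. It assumes
the indexer is started as a separate command; the helper name and the
value 100000 are only illustrations, not recommendations. By default
Xapian flushes after every 10000 documents added or changed, and a higher
threshold buffers more changes in memory between flushes, so with ~10k
terms per document the right value depends on how much RAM you have.

```python
import os
import subprocess


def run_with_flush_threshold(cmd, threshold):
    """Run an indexer command with XAPIAN_FLUSH_THRESHOLD set.

    `cmd` is the argument list for your own indexer (hypothetical here).
    The variable must be in the environment of the indexing process
    before it opens the writable database.
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["XAPIAN_FLUSH_THRESHOLD"] = str(threshold)
    return subprocess.run(cmd, env=env, check=True)


# Example call (the script name is a placeholder for your indexer):
# run_with_flush_threshold(["perl", "index_articles.pl"], 100000)
```

Setting the variable in the shell before starting the indexer would do
the same thing.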

> The articles are UTF-8 CJK, and we use bigrams to generate terms, so it's
> very easy to generate ~10k terms for a mid-size article. The article
> itself is not stored in Xapian, only the terms.

That is a lot more terms than is typical, so I'd expect indexing to be
slower, but 4 per second is very slow.
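
To illustrate why bigram indexing produces so many terms, here is a
minimal sketch. It assumes plain overlapping character pairs with
whitespace dropped, which may not match your tokeniser; the function
name is made up for the example. A text of N characters yields N-1
bigram occurrences, so a few thousand characters is enough to reach
thousands of terms.

```python
def bigram_terms(text):
    """Return overlapping two-character terms, ignoring whitespace.

    Simplification: dropping whitespace lets pairs span what were
    separate runs of text, which a real tokeniser may avoid.
    """
    chars = [c for c in text if not c.isspace()]
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


sample = "全文検索エンジン"
terms = bigram_terms(sample)
# Occurrences versus distinct terms; the distinct count is what
# determines how many posting lists each document touches.
print(len(terms), len(set(terms)))
```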

Cheers,
    Olly


