[Xapian-discuss] Many problems with Xapian 1.x let's roll back to Xapian to 0.9.x

Olly Betts olly at survex.com
Mon Jun 18 04:39:55 BST 2007


On Sat, Jun 16, 2007 at 09:46:06PM -0700, Kevin Duraj wrote:
> There are so many problems with Xapian 1.0.x that I must propose to
> roll back to the last good working version of Xapian 0.9.x

Please be realistic.  You're not going to get us to throw away many,
many fixes and improvements because you're having some teething trouble
with the new release.

You're welcome to continue to use 0.9.x if it makes you happier, but
the more sensible approach would be to help us work out what is causing
the slowdowns you're seeing so we can address them.

> What used to take 50 minutes to index now will not index in several
> hours.

So you keep saying, but you've yet to offer us any insight as to why!

Over the weekend, I've been trying out reindexing gmane using 1.0.x
(xapian-core-1.0.1_svn8931 to be precise, but that's essentially
just 1.0.1 plus a new lazy table creation feature which avoids
creating the value and/or position tables if they aren't used).

It's indexed 6.5 million so far, and the indexing rate is a little less
than half what it was on the last rebuild (which used 0.9.9 flint).
However, before I was indexing only unstemmed forms, whereas now I'm
indexing both stemmed and unstemmed - this means that the number of term
postings will have almost doubled and so I'd expect the rate to almost
halve just because of that.
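
As a back-of-envelope sketch of that reasoning, assuming indexing cost
scales roughly linearly with the number of term postings written (a
simplification, not something measured here). The function name and the
figures in the usage note are made up for illustration:

```python
def expected_rate(old_rate, old_postings_per_doc, new_postings_per_doc):
    """Estimate the new indexing rate (documents per second).

    Assumes the time to index a document is proportional to the number
    of term postings it generates, which is only an approximation.
    """
    if old_postings_per_doc <= 0 or new_postings_per_doc <= 0:
        raise ValueError("posting counts must be positive")
    return old_rate * old_postings_per_doc / new_postings_per_doc
```

With hypothetical numbers, expected_rate(1000.0, 100, 200) is half the
old rate, and a ratio a little under 2 gives a rate a little over half.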

Before starting the full reindex, I also ran the start of it with the
old indexing strategy but the new Xapian - this showed it was a little
slower than before, but only a few percent slower, not several times
slower.

In short, I can't reproduce what you describe with the little information
you've provided.  Also, nobody else has reported such issues, and I've
heard reports that 1.0.1 is faster at indexing for some people
(Jean-Francois Dockes reports that it makes Recoll index nearly twice as
fast compared to 0.9 quartz).

So if you want this to be addressed, you're going to have to analyse
what's going on in your case.

I suggested before that you should try increasing COMPRESS_MIN in
backends/flint/flint_table.cc to see what effect different values
have.  Have you tried that?

It's currently 4, but what happens if it's 100?  If that makes no
difference, go higher; if that makes a huge difference, try a value in
between.
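
That procedure can be sketched as a small search loop. This is only an
illustration: time_index is a hypothetical callable that rebuilds with
the given COMPRESS_MIN, runs your indexing job, and returns the elapsed
seconds. The factor of 10 for "go higher", the 5% threshold for what
counts as a difference, and the ceiling are all assumptions.

```python
def probe_compress_min(time_index, baseline=4, start=100,
                       ceiling=100_000, tolerance=0.05, bisect_steps=3):
    """Return {COMPRESS_MIN value: seconds} for every value tried."""
    base_time = time_index(baseline)
    tried = {baseline: base_time}

    def differs(value):
        # Time this value and compare it against the baseline run.
        tried[value] = time_index(value)
        return abs(tried[value] - base_time) > tolerance * base_time

    lo, hi = baseline, start
    # No difference: go higher (here by a factor of 10, an assumption).
    while not differs(hi):
        lo, hi = hi, hi * 10
        if hi > ceiling:
            return tried  # nothing up to the ceiling changed the timing
    # A difference: try values in between, a few times only, since
    # each trial is a rebuild plus an indexing run.
    for _ in range(bisect_steps):
        if hi - lo <= 1:
            break
        mid = (lo + hi) // 2
        if differs(mid):
            hi = mid
        else:
            lo = mid
    return tried
```

The returned timings show roughly where the behaviour changes, which
is the useful thing to report back.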

Cheers,
    Olly


