[Xapian-discuss] Flint failed to deliver indexing performance to Quartz.

Olly Betts olly at survex.com
Mon Jun 18 05:09:15 BST 2007


On Sat, Jun 16, 2007 at 11:50:58PM -0700, Kevin Duraj wrote:
> I am proposing to remove Flint as the default database and make
> Quartz the default again.

Again, please be realistic.

> The catch is not that the Flint database is smaller and faster
> during searches than the Quartz database, which is what the
> developers were concerned with when measuring, while neglecting to
> measure performance when creating large indexes.

Please stop spreading FUD.  This simply isn't true.  I looked at
indexing performance, search performance, and index size during
development.

> The truth is that the Flint database cannot scale beyond 5 million
> documents to index in reasonable time. High disk activity has been
> reported when indexing using Flint; the server seizes up, unable to
> write to the hard disk, compared with when Quartz is used to index.

You are the only person to report this, but rather than help us to
address this by investigating why, you just keep telling us about it
and then suggesting we "fix" it by throwing away months of useful
work.

> Flint has shown itself to be 10-16 times slower when indexing 10
> million documents on servers with 4 CPUs and 16GB of memory.

I don't have such a server to test on, and I don't have your data
sets to test with, so you're going to need to do some detective work
as to why this might be, or show me how to demonstrate similar problems
with data sets I have access to on machines I have access to.

> Flint has so far absolutely failed to deliver even a fraction of the
> performance that the Quartz database has been achieving when indexing
> a high quantity of documents in a short time using plenty of memory.

... in your application.  It seems to work very well for others.

> Example of my benchmarks:
> 
> Quartz database indexes 10 million unique documents with
> XAPIAN_FLUSH_THRESHOLD=10000000 set in less than 1 hour.
> 
> Flint database indexes 10 million unique documents with
> XAPIAN_FLUSH_THRESHOLD=10000000 set in less than 16 hours.

A useful benchmark needs to include sufficient information that it can
be reproduced.  This isn't a useful benchmark, since there's no way I
can reproduce it for myself.
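
As a minimal sketch of what "sufficient information" could look like,
a benchmark can ship a deterministic synthetic corpus and report the
settings it ran with.  Everything named here (make_documents,
run_benchmark, index_fn) is hypothetical and not part of Xapian; the
actual indexing call is left as a callback you would supply, and the
only thing taken from this thread is the XAPIAN_FLUSH_THRESHOLD
environment variable.

```python
import os
import random
import time


def make_documents(count, words_per_doc=50, vocab_size=10000, seed=42):
    """Yield (docid, text) pairs that are identical on every run for a given seed."""
    rng = random.Random(seed)  # private generator, so results do not depend on global state
    vocab = ["w%d" % i for i in range(vocab_size)]
    for docid in range(count):
        yield docid, " ".join(rng.choice(vocab) for _ in range(words_per_doc))


def run_benchmark(index_fn, count, **corpus_args):
    """Time index_fn over the synthetic corpus and record the settings used.

    index_fn is a hypothetical callback taking (docid, text); it stands in
    for whatever code adds a document to the database under test.
    """
    start = time.perf_counter()
    indexed = 0
    for docid, text in make_documents(count, **corpus_args):
        index_fn(docid, text)
        indexed += 1
    elapsed = time.perf_counter() - start
    return {
        "documents": indexed,
        "seconds": elapsed,
        # Report the flush threshold the run actually saw (None if unset).
        "flush_threshold": os.environ.get("XAPIAN_FLUSH_THRESHOLD"),
        "corpus_args": corpus_args,
    }
```

Running the same script against each backend, with the same seed and
the same environment, and posting the script alongside the numbers
would give others something they can rerun on their own machines.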

> Please provide settings to remove Flint and add Quartz as default
> database.

If you really must, that already exists:

./configure --disable-backend-flint

But it's tantamount to burying your head in the sand.

> Unless the unacceptable indexing performance using the Flint
> database is resolved.

Since only you can see it, it will only be resolved if you help us
to resolve it!

> Do not even think about removing support for the Quartz database
> from Xapian.

Quartz is scheduled for removal in Xapian 1.1.0.  We don't have the
resources to maintain multiple generations of backends in parallel,
but if you really want it to stay, you could offer to maintain the code.  

However, it would almost certainly be easier to help us work out why
flint isn't working as well for you.

Cheers,
    Olly


