[Xapian-discuss] xapian vs lucene.net

Olly Betts olly at survex.com
Wed Aug 30 22:59:07 BST 2006


On Wed, Aug 30, 2006 at 12:16:20PM -0800, Oscar Usifer wrote:
> http://www.cdlib.org/inside/projects/xtf/Search_Engine_Comparison.pdf

I've read this comparison before - it's kind of shallow, with a number
of inaccuracies (at least for Xapian - I don't really know enough about
most of the other software to say).  It also appears to be written to
guide the choice of software for a particular use by a particular
organisation - for example, "Java" is listed as a factor in favour of
Lucene, but (language religion aside) being written in Java isn't an
advantage for every potential user.

It fails to include any source code or test data, or even give much
information at all on how tests were performed (I can't even seem to
see a spec for the test machine), so it's impossible to actually repeat
the tests.  This also means it's hard to know if they're actually
comparing like-for-like.
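
For illustration only, here is a minimal sketch of the kind of detail that
would make such tests repeatable: record the test machine's spec and the
per-run timings alongside the results.  Everything in it is an assumption
of mine rather than anything from the report; in particular
placeholder_indexing_task is a hypothetical stand-in for indexing the
real test data with the engine being measured.

```python
import json
import platform
import time


def describe_machine():
    """Collect basic facts about the test machine from the standard library."""
    return {
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "python": platform.python_version(),
    }


def time_runs(task, repeats=3):
    """Run task() several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def placeholder_indexing_task():
    # Hypothetical stand-in: a real report would index the published test data here.
    sum(len(word) for word in ("xapian lucene " * 1000).split())


if __name__ == "__main__":
    report = {
        "machine": describe_machine(),
        "durations_seconds": time_runs(placeholder_indexing_task),
    }
    print(json.dumps(report, indent=2))
```

Publishing output like this together with the source code and test data
would let a reader rerun the tests and judge whether the comparison is
like-for-like.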

The report also claims Xapian leaks memory while indexing, which just
isn't the case.  We've run the testsuite under valgrind for years and
there are no memory leaks reported.  I also don't see unbounded growth
in memory usage when indexing gmane.  We actually do relatively little
explicit allocation and deallocation of memory.

And apparently Xapian is developed by an "informal closed group" of
"former employees of BrightStation PLC", which was news to me.  Richard
and I both used to work for BrightStation PLC, but nobody else currently
involved did, and it's certainly not a closed group.

A lot of the report is very out-of-date now too (it's dated January
2004) - e.g.  it says Xapian doesn't have a Java API, but Eric B. Ridge
contributed one in April 2004.  The same April 2004 release (0.8.0) also
reworked how updates are batched and applied so the indexing times and
memory usage will be far smaller now.  The database size figures will be
way off too compared to flint (or even the latest version of quartz).

Cheers,
    Olly


