[Xapian-devel] Index Size comparison

Jaguar Xiong xiong.jaguar at gmail.com
Mon Apr 23 15:16:51 BST 2012


Hi,
I ran a comparison of lucene-3.4 and xapian-1.3.0, following steps
similar to those in the blog post
(zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter).
The overall index sizes are:
Lucene 89M, Xapian 189M (chert backend, compacted).
Since I'm mainly interested in index size, I dug a little further and
dumped the full term list. There are about 360000 terms in the Lucene
index and about 285000 terms in the Xapian index. Surprisingly, though,
the termlist.DB of the Xapian index alone is already 122M.
Are there any ideas or plans for reducing the index size? I'd be glad to
help if I can.
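
For anyone wanting to reproduce the per-table figures above (termlist.DB
at 122M out of 189M is roughly 65% of the Xapian index), here is a
minimal sketch that tallies file sizes in a database directory. It is
not part of Xapian; the function names table_sizes and report are
hypothetical, and it assumes that files belonging to one table share
the part of the file name before the first dot (as in termlist.DB).

```python
import os
from collections import defaultdict


def table_sizes(db_dir):
    """Return {table_name: total_bytes} for the regular files in db_dir.

    Assumption: the table name is the part of the file name before the
    first dot, e.g. "termlist.DB" -> "termlist". Files without a dot
    are counted under their full name.
    """
    sizes = defaultdict(int)
    for entry in os.scandir(db_dir):
        if entry.is_file():
            table = entry.name.split(".", 1)[0]
            sizes[table] += entry.stat().st_size
    return dict(sizes)


def report(sizes):
    """Format one line per table, largest first, with its share of the total."""
    total = sum(sizes.values())
    lines = []
    for table, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        share = 100.0 * size / total if total else 0.0
        lines.append("%-12s %8.1fM %5.1f%%" % (table, size / 2**20, share))
    return lines
```

Calling print("\n".join(report(table_sizes(path)))) on the database
directory (path is a placeholder) lists each table with its size and
percentage, which makes it easy to see which table dominates.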

Thanks!
Jaguar