[Xapian-discuss] xapian indexing size?

John Paige paige.john at gmail.com
Thu May 5 18:54:18 BST 2005


Yes, I was expecting the index to be smaller than the corpus.
Tools such as Glimpse and swish-e, for example, create an index
that is much smaller than the corpus (between 10% and 25% of
the corpus size).
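
To make the comparison concrete, here is a minimal sketch of how the
index-to-corpus ratio can be measured. The helper names (dir_size_bytes,
index_ratio) are made up for illustration and are not part of Xapian; the
sizes plugged in are the ones reported in the quoted message below.

```python
import os

MB = 1024 * 1024


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path` (hypothetical helper)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def index_ratio(corpus_bytes, index_bytes):
    """Return index size divided by corpus size."""
    if corpus_bytes <= 0:
        raise ValueError("corpus size must be positive")
    return index_bytes / corpus_bytes


# Figures from the quoted message: 4MB corpus, 16MB index, 10MB compacted.
print(index_ratio(4 * MB, 16 * MB))  # 4.0
print(index_ratio(4 * MB, 10 * MB))  # 2.5
# After adding another 4MB of text: 8MB corpus, 32MB index.
print(index_ratio(8 * MB, 32 * MB))  # 4.0
```

On real data, dir_size_bytes could be called on the corpus directory and
on the database directory, and the two results passed to index_ratio. A
ratio between 0.10 and 0.25 is what I had in mind from the other tools.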

Thanks,
John

On 5/5/05, rm at fabula.de <rm at fabula.de> wrote:
> On Thu, May 05, 2005 at 01:39:20PM -0400, John Paige wrote:
> > Hi,
> >    I am evaluating xapian for use in our product. I just downloaded the
> > core and examples code from the website.
> > I'm puzzled about one thing though: when I used the test program
> > "simpleIndexer", I found that the index size is four times the
> > size of the corpus. I indexed 4MB worth of text files, and the index
> > was 16MB, and even after compaction it still consumed 10MB.
> > When I added an additional 4MB of text files, the index grew to 32MB.
> >
> > An index four times the size of the corpus doesn't seem
> > right. Am I doing something wrong?
> 
> Most likely not - but tell us what you _expect_ the index size to be?
> Do you expect the index size to be _smaller_ than the corpus?
> 
>  Cheers Ralf Mattes
> 
> > Thanks,
> > John


