[Xapian-discuss] Size of spelling database ?

Fabrice Colin fabrice.colin at gmail.com
Fri Nov 2 15:37:42 GMT 2007


On 11/2/07, Richard Boulton <richard at lemurconsulting.com> wrote:
> Fabrice Colin wrote:
> > Hi all,
> >
> > Can one estimate how big the spelling database would be, based
> > on the total index size and/or the size of the postlist table ?
> >
> > From my own experience, the size of the spelling database is
> > roughly similar to the postlist table. That's with indexing documents
> > with the TermGenerator, and stemming enabled for most documents.
> > Does this sound about right ?
>
> In my experience, the spelling table is a lot smaller than the postlist.
> For example, I have one database for which the spelling table is 16MB
> and the postlist is 576MB.
>
> However, for a small database, they could well be similar.  I'd expect
> the postlist database to grow roughly with the number of documents in
> the database, and the spelling database to grow more like the number of
> terms.  Assuming there are lots of documents sharing the same terms, the
> spelling database should therefore grow a lot more slowly.
>
> What are your current actual sizes?
>
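For anyone who wants to collect the same figures, here is a minimal sketch that sums the on-disk size of each table in a database directory. It assumes the database is a plain directory whose files are named after their table (postlist.DB, spelling.DB and so on, as in the figures below) and that any other file sharing the same prefix belongs to that table. The helper name table_sizes is made up for illustration and is not part of Xapian.

```python
import os
from collections import defaultdict


def table_sizes(db_dir):
    """Return total bytes on disk per table in db_dir.

    Assumption: the table name is the part of the file name before the
    first dot, e.g. "postlist.DB" -> "postlist". Files without a dot are
    keyed by their whole name.
    """
    totals = defaultdict(int)
    for entry in os.scandir(db_dir):
        if entry.is_file():
            table = entry.name.split(".", 1)[0]
            totals[table] += entry.stat().st_size
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage: python table_sizes.py /path/to/database
    if len(sys.argv) > 1:
        for table, size in sorted(table_sizes(sys.argv[1]).items()):
            print(f"{table}: {size / (1024 * 1024):.1f} MB")
```

Subdirectories are ignored, so this only reflects files sitting directly in the database directory.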
I have one 192MB index here with a 51MB postlist and a 98MB spelling
database. Another index is 597MB, with a 159MB postlist.DB and a
275MB spelling.DB.
I also got a report about a 1.3GB index with a 408MB postlist.DB and a
622MB spelling.DB.
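
To put those numbers side by side, here is a small sketch that computes the spelling-to-postlist ratio for each database mentioned in this thread. The sizes are the ones quoted above; the labels and the helper name spelling_to_postlist are only for illustration.

```python
# (label, postlist size, spelling size), both in megabytes as quoted in this thread
reports = [
    ("Richard's database", 576, 16),
    ("192 MB index", 51, 98),
    ("597 MB index", 159, 275),
    ("1.3 GB index", 408, 622),
]


def spelling_to_postlist(postlist_mb, spelling_mb):
    """Return how many times larger the spelling table is than the postlist."""
    return spelling_mb / postlist_mb


for label, postlist_mb, spelling_mb in reports:
    ratio = spelling_to_postlist(postlist_mb, spelling_mb)
    print(f"{label}: spelling is {ratio:.2f}x the postlist")
```

For my three indexes the ratio comes out at roughly 1.9, 1.7 and 1.5, against about 0.03 for Richard's database. So it does shrink as the index grows, but it stays well above what he is seeing.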

Should I be worried ? ;-)

Fabrice


