[Xapian-devel] Tarball size (was Re: [Xapian-commits] 7916: trunk/xapian-core/ trunk/xapian-core/docs/)

Olly Betts olly at survex.com
Thu Mar 8 05:29:55 GMT 2007


On Wed, Mar 07, 2007 at 11:22:36PM +0000, olly wrote:
> docs/Makefile.am: Stop shipping docs/apidoc/latex/* in the
> xapian-core tarballs since it's just useless bloat.  Removing it
> more than halves the size of the tarball (55% reduction!)

Hmm, this is very odd.

I tried unpacking the last xapian-core snapshot tarball, deleting the
junk files, and rebuilding the tarball, and I got a 55% smaller one,
which is where that figure came from.

But the new snapshots are only 9.3% smaller (still a nice gain but not
anywhere near 55%).  So I investigated a bit.

If I take a xapian-core snapshot (new or old) and just un-tar and re-tar
the *same* files, it roughly halves in size.  It doesn't look like "make
dist" does anything stupid, and if I reproduce the options it seems to
be using, I still get similar reductions in size.
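
A rough sketch of that un-tar/re-tar experiment, written in Python rather
than with the tar commands actually used. The function name repack and the
paths passed to it are made up for illustration. Note that Python's tarfile
adds directory entries in sorted order, which need not match the order a
command-line tar would produce.

```python
import os
import tarfile
import tempfile


def repack(src_path, dst_path):
    """Unpack src_path into a scratch directory and re-tar the same files.

    Returns (old_size, new_size) in bytes so the two can be compared.
    """
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(src_path, "r:gz") as src:
            src.extractall(scratch)
        with tarfile.open(dst_path, "w:gz") as dst:
            # Add each top-level entry; tarfile recurses into directories
            # and visits their contents in sorted order.
            for name in sorted(os.listdir(scratch)):
                dst.add(os.path.join(scratch, name), arcname=name)
    return os.path.getsize(src_path), os.path.getsize(dst_path)
```

Calling repack("snapshot.tar.gz", "repacked.tar.gz") on a snapshot (both
filenames hypothetical) and comparing the two returned sizes would show
whether repacking alone changes the compressed size.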

I tried the same thing on a few other autotools-generated tarballs
(including xapian-omega and xapian-bindings snapshots), and for those
the repacked tarball comes out at very nearly the same size.

If I do "tar ztvf" on each and diff, 3 files appear at different places
in the two lists, all files in apidoc/html.  So I guess that must be why
one compresses so much better, but it seems rather an extreme difference.
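
The listing comparison can be sketched in Python too, as a stand-in for
"tar ztvf" plus diff. The helper names member_names and moved_members are
invented here, and "moved" is taken to mean a name that a diff of the two
listings would report as removed in one place and added in another.

```python
import difflib
import tarfile


def member_names(path):
    """Return member names in the order they are stored in the tarball."""
    with tarfile.open(path, "r:*") as tar:
        return tar.getnames()


def moved_members(old_names, new_names):
    """Return names present in both listings but at different places."""
    matcher = difflib.SequenceMatcher(None, old_names, new_names,
                                      autojunk=False)
    removed, added = set(), set()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.update(old_names[i1:i2])
        if tag in ("insert", "replace"):
            added.update(new_names[j1:j2])
    # A name that was both removed and added has changed position.
    return sorted(removed & added)
```

Feeding it member_names() of the original and the repacked tarball would
list the files whose position changed.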

I guessed it might be the "dist-hook" rule which puts these files in
that caused this somehow, but changing that to just add them via a
directory listed in EXTRA_DIST doesn't change the size much.

Perhaps it's just that xapian-core is especially sensitive to file
ordering.  If so, I guess we need to add an unpack/repack stage between
generating tarballs and copying them to the website...
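
To show that member order alone can change the size of a .tar.gz, here is
a small synthetic Python sketch. It relies on the fact that gzip's DEFLATE
only looks back 32 KiB for repeated data. The file names and contents are
invented, and nothing here establishes that this is what happens with the
xapian-core tarballs.

```python
import io
import random
import tarfile


def targz_size(files, order):
    """Size in bytes of an in-memory .tar.gz holding files in this order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in order:
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            tar.addfile(info, io.BytesIO(files[name]))
    return len(buf.getvalue())


rng = random.Random(0)


def noise(n):
    """Return n bytes of pseudo-random, essentially incompressible data."""
    return bytes(rng.getrandbits(8) for _ in range(n))


page = noise(20000)
# Two identical "pages" plus unrelated filler larger than the 32 KiB window.
files = {"a.html": page, "b.html": page, "filler.bin": noise(60000)}

# Copies next to each other: the second can be encoded as back-references.
adjacent = targz_size(files, ["a.html", "b.html", "filler.bin"])
# Copies separated by the filler: the first is out of the window by then.
separated = targz_size(files, ["a.html", "filler.bin", "b.html"])

print("adjacent:", adjacent, "separated:", separated)
```

With the duplicate files adjacent the archive should come out smaller by
roughly the size of one copy, even though both archives hold exactly the
same files.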

Cheers,
    Olly


