"Theodore Ts'o" <ty...@mit.edu> writes:

> On Sun, May 12, 2024 at 04:27:06PM +0200, Simon Josefsson wrote:
>> Going into detail, you use 'gzip -9n' but I use git-archive defaults
>> which is the same as -n aka --no-name. I agree adding -9 aka --best is
>> an improvement. Gnulib's maint.mk also add --rsyncable, would you agree
>> that this is also an improvement?
>
> I'm not convinced --rsyncable is an improvement. It makes the
> compressed object slightly larger, and in exchange, if the compressed
> object changes slightly, it's possible that when you rsync the changed
> file, it might be more efficient. But in the case of PGP signed
> release tarballs, the file is constant; it's never going to change,
> and even if there are slight changes between say, e2fsprogs v1.47.0
> and e2fsprogs v1.47.1, in practice, this is not something --rsyncable
> can take advantage of, unless you manually copy
> e2fsprogs-v1.47.0.tar.gz to e2fsprogs-v1.47.1.tar.bz, and then rsync
> e2fsprogs-v1.471.tar.g.... and I don't think anyone is doing this,
> either automatically or manually.
>
> That being said, --rsyncable is mostly harmless, so I don't have
> strong feelings about changing it to add or remove in someone's
> release workflow.

Your example had me convinced, and I thought some more about why we
should really keep using it, given that it consumes a small percentage
more CPU and disk space.

I have realized that another common operation is storing and
transferring _several_ different releases of e2fsprogs. I would suspect
that most software releases are fairly similar to the previous release
when uncompressed. With gzip --rsyncable, the compressed tarballs
should then be mostly similar too. Without --rsyncable, they will
largely be different, if I understand correctly. This affects
dedup-able storage and transfer methods, and some anecdotal evidence
suggests the improvement is significant - going from 215GB to 176GB vs
13GB:

https://gitlab.archlinux.org/archlinux/infrastructure/-/merge_requests/429

Maybe someone could do an experiment to see if there is substance to
this argument; it's not clear to me that the example is comparable.
Storing/transferring several releases of the same software could add up
to significant savings for larger sets of archives.

As the downside seems fairly small, and the potential upside may be
significant, I will use and recommend --rsyncable for git-archive
release tarballs:

  git archive --format=tar --prefix=$PACKAGE-$VERSION/ HEAD | \
     env GZIP= gzip --no-name --best --rsyncable \
     > $PACKAGE-$VERSION-src.tar.gz

/Simon
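
PS. A rough sketch of how the experiment suggested above might be run,
assuming rsync and a gzip that supports --rsyncable are installed. The
function names (pack, matched_bytes, compare_releases) and the tags in
the usage comment are illustrative assumptions, not part of any existing
tool. rsync's "Matched data" statistic is used here only as a stand-in
for what a dedup-able store could reuse; a particular storage system may
chunk data differently, so the numbers would be indicative at best.

```sh
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers.

# Compress a tar stream from stdin with the flags from the mail above,
# plus any extra gzip flags given as arguments (e.g. --rsyncable).
pack() {
    env GZIP= gzip --no-name --best "$@"
}

# Print how many bytes of NEW rsync's delta algorithm can reuse from OLD.
matched_bytes() {
    old=$1
    new=$2
    scratch=$(mktemp) || return 1
    cp "$old" "$scratch" || return 1
    # --no-whole-file forces the delta algorithm even for a local copy;
    # --ignore-times stops the quick check from skipping the file.
    rsync --no-whole-file --ignore-times --stats "$new" "$scratch" |
        sed -n 's/^Matched data: \([0-9.,]*\) bytes.*/\1/p' | tr -d '.,'
    rm -f "$scratch"
}

# Compress two uncompressed tarballs with and without --rsyncable and
# report, for each variant, the new file's size and the reusable bytes.
compare_releases() {
    old_tar=$1
    new_tar=$2
    dir=$(mktemp -d) || return 1
    pack < "$old_tar" > "$dir/old-plain.gz"
    pack < "$new_tar" > "$dir/new-plain.gz"
    pack --rsyncable < "$old_tar" > "$dir/old-rsyncable.gz"
    pack --rsyncable < "$new_tar" > "$dir/new-rsyncable.gz"
    for kind in plain rsyncable; do
        size=$(wc -c < "$dir/new-$kind.gz" | tr -d ' ')
        reused=$(matched_bytes "$dir/old-$kind.gz" "$dir/new-$kind.gz")
        echo "$kind: size=$size matched=$reused"
    done
    rm -rf "$dir"
}

# Possible usage (the tag names are assumptions about the repository):
#   git archive --format=tar --prefix=e2fsprogs-1.47.0/ v1.47.0 > old.tar
#   git archive --format=tar --prefix=e2fsprogs-1.47.1/ v1.47.1 > new.tar
#   compare_releases old.tar new.tar
```

If the argument holds, the "matched" figure should be a much larger
share of "size" for the rsyncable pair than for the plain pair, while
the rsyncable files themselves come out slightly larger.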