On Sun, Aug 20, 2017 at 01:22:52AM +0200, Marco d'Itri wrote:
> On Aug 20, Theodore Ts'o <ty...@mit.edu> wrote:
>
> > By the way, I was just taking a quick look, and e2fsprogs isn't the
> > only offender in this regard. Out of a 201 MB i386 minbase chroot, 33
> > MB, or over 16% can be found in /usr/share/locale. The next largest
> > hierarchies under /usr/share are /usr/share/doc, at 9.4 MB, and
> > /usr/share/man, at 6.1 MB.
> >
> > So if the goal is to shrink minbase, that might be a really good place
> > to start. Packages that might be good initial first targets would
> > probably be coreutils, dpkg, and gnupg.
>
> I am not persuaded that we should split a significant number of packages
> (how many? Where do you draw the line?) this way: we already have a tool
> to solve this in a general way, i.e. localepurge.

From the localepurge package description:

    This tool is a hack which is *not* integrated with the system's
    package management system and therefore is not for the faint of
    heart. Its interference can provoke strange, but usually harmless,
    behavior in programs related to apt/dpkg, such as dpkg-repack,
    reportbug, etc.

In either case, it seems unlikely people would be happy using either
localepurge or dpkg --path-exclude for official Docker images.
Splitting out the locale files is going to be much easier to support
than trying to hammer localepurge into debootstrap.

Furthermore, it's not a significant number of packages, simply because
there aren't a huge number of packages in minbase that have locale
files to begin with, and only a handful of those have significantly
sized locale files.

Here's the breakdown of all of the packages with locale files in the
minbase set in Debian Jessie, the savings (in kilobytes of installed
size) if we were to split out their locale files, and the cumulative
savings if we split the top N packages.

Package            Savings  Cumulative Savings  Percentage
coreutils             8052                8052       24.91
dpkg                  4620               12672       39.20
bash                  3744               16416       50.78
gnupg                 3424               19840       61.37
e2fsprogs             1776               21616       66.86
tar                   1680               23296       72.06
shadow                1632               24928       77.11
apt                   1528               26456       81.84
libapt-pkg4.12        1052               27508       85.09
Linux-PAM              796               28304       87.55
findutils              756               29060       89.89
grep                   636               29696       91.86
diffutils              620               30316       93.78
debconf                596               30912       95.62
adduser                444               31356       96.99
sed                    428               31784       98.32
libgpg-error           388               32172       99.52
systemd                 84               32256       99.78
acl                     72               32328      100.00

I wouldn't call 12 packages a "significant number", and that would get
us over 90% of the 32 megabyte savings. Splitting the top six would
get us 72% of the savings. And with *zero* chance of compatibility
problems.

In addition, most of these packages, including the top three packages
on the above list, are ones that we could **never** remove from the
essential=yes / priority=required set.
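
For anyone who wants to double-check the arithmetic in the table above,
here is a rough sketch in Python. The per-package numbers are copied
straight from the table; the function names (cumulative_table,
packages_needed) are made up for illustration and are not part of any
Debian tool.

```python
# Installed-size savings in kilobytes, copied from the table above
# (Debian Jessie minbase), already sorted from largest to smallest.
SAVINGS_KB = [
    ("coreutils", 8052), ("dpkg", 4620), ("bash", 3744), ("gnupg", 3424),
    ("e2fsprogs", 1776), ("tar", 1680), ("shadow", 1632), ("apt", 1528),
    ("libapt-pkg4.12", 1052), ("Linux-PAM", 796), ("findutils", 756),
    ("grep", 636), ("diffutils", 620), ("debconf", 596), ("adduser", 444),
    ("sed", 428), ("libgpg-error", 388), ("systemd", 84), ("acl", 72),
]


def cumulative_table(rows):
    """Return (package, savings, running_total, percent_of_total) tuples."""
    total = sum(kb for _, kb in rows)
    running = 0
    table = []
    for name, kb in rows:
        running += kb
        table.append((name, kb, running, 100.0 * running / total))
    return table


def packages_needed(rows, target_percent):
    """Smallest N such that splitting the top N packages reaches the target."""
    for n, (_, _, _, pct) in enumerate(cumulative_table(rows), start=1):
        if pct >= target_percent:
            return n
    return len(rows)


if __name__ == "__main__":
    print(f"{'Package':<19}{'Savings':>7}{'Cumulative Savings':>20}{'Percentage':>12}")
    for name, kb, running, pct in cumulative_table(SAVINGS_KB):
        print(f"{name:<19}{kb:>7}{running:>20}{pct:>12.2f}")
```

With these numbers, packages_needed(SAVINGS_KB, 90) comes out to 12,
which is where the "12 packages for over 90%" figure comes from.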

Finally, splitting out the locale files for coreutils alone would net
almost three times the savings of completely removing e2fsprogs (2809k
in Jessie) from the minbase set. If you think it's worth it to work
on removing e2fsprogs from minbase, why not split out at least
coreutils, dpkg, bash, and gnupg? It will be less work, and will
result in more disk space saved.

					- Ted