On Sun, Aug 20, 2017 at 01:22:52AM +0200, Marco d'Itri wrote:
> On Aug 20, Theodore Ts'o <ty...@mit.edu> wrote:
> 
> > By the way, I was just taking a quick look, and e2fsprogs isn't the
> > only offender in this regard.  Out of a 201 MB i386 minbase chroot, 33
> > MB, or over 16% can be found in /usr/share/locale.  The next largest
> > hierarchies under /usr/share are /usr/share/doc, at 9.4 MB, and
> > /usr/share/man, at 6.1 MB.
> > 
> > So if the goal is to shrink minbase, that might be a really good place
> > to start.  Packages that might be good initial first targets would
> > probably be coreutils, dpkg, and gnupg.
>
> I am not persuaded that we should split a significant number of packages 
> (how many? Where do you draw the line?) this way: we already have a tool 
> to solve this in a general way, i.e. localepurge.

From the localepurge package description:

   This tool is a hack which is *not* integrated with the system's
   package management system and therefore is not for the faint of heart.
   Its interference can provoke strange, but usually harmless, behavior in
   programs related to apt/dpkg, such as dpkg-repack, reportbug, etc.

In either case, it seems unlikely that people would be happy using
either localepurge or dpkg --path-exclude for official Docker images.
Splitting out the locale files is going to be much easier to support
than trying to hammer localepurge into debootstrap.

Furthermore, it's not a significant number of packages, simply because
there aren't a huge number of packages in minbase that have locale
files to begin with, and only a handful of those have significantly
sized locale files.

Here's the breakdown of all of the packages with locale files in the
minbase set in Debian Jessie, the savings (in kilobytes of installed
size) if we were to split out their locale files, and the cumulative
savings if we split the top N packages.

Package         Savings Cumulative
                        Savings  Percentage
coreutils       8052     8052    24.91
dpkg            4620    12672    39.20
bash            3744    16416    50.78
gnupg           3424    19840    61.37
e2fsprogs       1776    21616    66.86
tar             1680    23296    72.06
shadow          1632    24928    77.11
apt             1528    26456    81.84
libapt-pkg4.12  1052    27508    85.09
Linux-PAM       796     28304    87.55
findutils       756     29060    89.89
grep            636     29696    91.86
diffutils       620     30316    93.78
debconf         596     30912    95.62
adduser         444     31356    96.99
sed             428     31784    98.32
libgpg-error    388     32172    99.52
systemd         84      32256    99.78
acl             72      32328   100.00

I wouldn't call 12 packages a "significant number", and that would get
us over 90% of the 32 megabyte savings.  Splitting the top six would
get us 72% of the savings.  And with *zero* chance of compatibility
problems.
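
The cumulative columns above can be reproduced from the per-package
figures alone. A minimal sketch in Python, assuming the sizes are the
ones listed in the table; the names SAVINGS_KB and cumulative_savings
are purely illustrative and not part of any Debian tooling:

```python
# Per-package locale sizes in KB of installed size, copied from the table above.
SAVINGS_KB = [
    ("coreutils", 8052), ("dpkg", 4620), ("bash", 3744), ("gnupg", 3424),
    ("e2fsprogs", 1776), ("tar", 1680), ("shadow", 1632), ("apt", 1528),
    ("libapt-pkg4.12", 1052), ("Linux-PAM", 796), ("findutils", 756),
    ("grep", 636), ("diffutils", 620), ("debconf", 596), ("adduser", 444),
    ("sed", 428), ("libgpg-error", 388), ("systemd", 84), ("acl", 72),
]


def cumulative_savings(rows):
    """Return (package, savings, running total, percent of total), largest first."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    total = sum(kb for _, kb in ordered)
    result = []
    running = 0
    for name, kb in ordered:
        running += kb
        result.append((name, kb, running, round(100.0 * running / total, 2)))
    return result


if __name__ == "__main__":
    for name, kb, running, pct in cumulative_savings(SAVINGS_KB):
        print(f"{name:<15} {kb:>6} {running:>8} {pct:>8.2f}")
```

Reading off the row for the Nth package gives the share of the total
that splitting the top N packages would recover.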

In addition, most of these packages, including the top three
packages on the above list, are ones that we could **never** remove
from the essential=yes / priority=required set.

Finally, splitting out the locale files for coreutils alone would net
almost three times the savings of completely removing e2fsprogs (2809k
in Jessie) from the minbase set.  If you think it's worth it to work
on removing e2fsprogs from minbase, why not split out at least
coreutils, dpkg, bash, and gnupg?  It will be less work, and will
result in more disk space saved.

                                        - Ted
