reopen 775163
quit

On Mon, Jan 12, 2015 at 10:34:04AM +0100, David Kalnischkies wrote:
> On Mon, Jan 12, 2015 at 09:36:00AM +0100, Michael Vogt wrote:
> > On Sun, Jan 11, 2015 at 09:40:20PM -0800, Elliott Mitchell wrote:
> > > I've ended up examining how much space programs are using in /var, and
> > > APT is the top pig, using close to half of /var as /var/lib/apt/lists,
> > > one factor does appear to be exasperating this, `dpkg` has 5 foreign
> > > architectures.
>
> 5 architectures? That is a lot??? I presume you are a heavy cross builder?

I dunno, depends on what one considers "a lot" and how much cross-building
it takes to qualify as a heavy cross-builder. I will note several of the 5
are subarchitectures, not wholly separate architectures.

> If you have repositories you don't want to get the data for a specific
> architecture, consider the sources.list [arch-=]-syntax: see manpage.

Useful to know, but this doesn't have much impact. The lists files for
pure Debian testing and stable are by far the biggest users in
/var/lib/apt.

> > > Trying a few compression methods:
> > >
> > > 426248 lists
> > > 114580 lists.gz
> > >  90868 lists.bz2
> > >  85648 lists.lzma
> > >  86532 lists.xz
> > >
> > > Nearly all of this space is being used for the Packages files. Merely
> > > compressing them would be a rather major improvement. The main Debian
> > > testing file is the biggest of these.
> >
> > You can use the configuration option
> > """
> > Acquire::GzipIndexes "1";
> > """
> > to keep the indexes compressed on disk. You trade the speed for
> > building the mmap cache with the size of the data on disk.
>
> It isn't just startup, other parts of runtime will access it to, so it
> can/will be slow all around. Searching for example, but also any action
> downloading a package (because of the Filename: field, among others).
> It is the "price" you pay for Debian having such a huge archive. You can
> freely delete the /var/cache/apt/lists directory through if you are done
> working with apt for the moment. This is usually done on space
> constraint embedded systems for example. Just remember to do a 'apt-get
> update' before you use apt the next time and apt will recreate the
> directory and its content.
>
> (Note btw that the option mentioned above keeps the compressed files it
> downloaded, so it isn't compressing the entire directory, which means
> less savings - note also that some compression algorithms are more
> cpu/memory/time hungry than others while they are uncompressing.)

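
A comparison like the one quoted above could be reproduced with something
along these lines. This is only a minimal sketch, assuming Python 3 and its
standard gzip, bz2 and lzma modules; `compressed_sizes` and `lists_sizes`
are hypothetical helper names, and it compresses each file on its own rather
than the directory as a whole, so its totals would not match the figures
quoted above.

```python
import bz2
import gzip
import lzma
from pathlib import Path


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` raw and under each stdlib codec."""
    return {
        "raw": len(data),
        "gz": len(gzip.compress(data)),
        "bz2": len(bz2.compress(data)),
        # FORMAT_ALONE is the legacy .lzma container, FORMAT_XZ is .xz
        "lzma": len(lzma.compress(data, format=lzma.FORMAT_ALONE)),
        "xz": len(lzma.compress(data, format=lzma.FORMAT_XZ)),
    }


def lists_sizes(lists_dir: str) -> dict:
    """Sum the per-file sizes over every regular file directly in `lists_dir`."""
    totals = {}
    for path in sorted(Path(lists_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories such as partial/
        for name, size in compressed_sizes(path.read_bytes()).items():
            totals[name] = totals.get(name, 0) + size
    return totals
```

Pointing `lists_sizes` at /var/lib/apt/lists may need to be done as root,
since not every file there is necessarily world-readable.
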
Given how often `apt-get update ; apt-get upgrade` *should* be run (weekly
for workstations, perhaps monthly for embedded systems), nuking the lists
files is a distinctly losing proposition. Depending upon what operations
are most used, an index into the compressed files could recover most of
the speed.

I also tried the above option before submitting the bug report; the space
savings were trivial.

> > Note that this option works best with later apt versions (1.0.9.2 or
> > later) where this option supports all compressions that apt supports
> > (the older versions only support gzip).
>
> As there isn't much else we can do about it, I am closing with that
> version number.

There are other things that could be done which would result in major
reductions in APT's usage of space in /var.

Looking at /var/lib/apt/lists:

The <repo>_dists_<dist>_<component>_binary-<arch>_Packages files have a
*lot* of redundancy among them. While the Filename, Size, and checksum
fields will differ between architectures, nearly all other fields won't
vary between architectures. Making a common
<repo>_dists_<dist>_<component>_common_Shared file and having the
binary-<arch> files merely contain the fields that differ from the common
file would present large savings for people who have support for foreign
architectures present (at least half per file, likely closer to
two-thirds).

Some additional savings could be had by taking advantage of redundancies in
the source_Sources files.

I also notice pkgcache.bin and srcpkgcache.bin in /var/cache/apt. These
two files appear to derive from other files; does keeping these on
persistent storage really speed up any operations?
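
To make the common/per-architecture split proposed above for the Packages
files concrete, here is a rough sketch in Python. It is not how APT works
today and no such on-disk format exists; `parse_packages` and `split_common`
are hypothetical names, the parser is deliberately simplified (one stanza
per package name), and the split is only computed in memory.

```python
def parse_packages(text: str) -> dict:
    """Parse Packages-style text into {package name: {field: value}}.

    Simplified: assumes one stanza per package name; continuation lines
    (leading whitespace) are appended to the previous field.
    """
    packages = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                fields[key] += "\n" + line
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if "Package" in fields:
            packages[fields["Package"]] = fields
    return packages


def split_common(per_arch: dict) -> tuple:
    """Split {arch: {package: fields}} into (shared, {arch: differing fields}).

    A field goes into `shared` only when every architecture carries the
    package with an identical value for that field.
    """
    shared = {}
    specific = {arch: {} for arch in per_arch}
    all_names = set().union(*(pkgs.keys() for pkgs in per_arch.values()))
    for name in sorted(all_names):
        stanzas = [pkgs.get(name) for pkgs in per_arch.values()]
        common = {}
        if all(s is not None for s in stanzas):
            first = stanzas[0]
            common = {
                k: v for k, v in first.items()
                if all(s.get(k) == v for s in stanzas[1:])
            }
        if common:
            shared[name] = common
        for arch, pkgs in per_arch.items():
            if name in pkgs:
                # keep only the fields not already covered by the shared part
                specific[arch][name] = {
                    k: v for k, v in pkgs[name].items() if common.get(k) != v
                }
    return shared, specific
```

Serializing `shared` once and each entry of `specific` per architecture,
then comparing the total against the original files, would give a rough
measure of how much of the estimated saving actually materializes.
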
(I didn't notice any difference after deleting pkgcache.bin and
srcpkgcache.bin.)


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sig...@m5p.com  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445