Hi,

I have been working on the cruft/cruft-ng package since 2014;
there were a few setbacks over the years,
like the mlocate -> plocate & UsrMerge transitions,
but it's alive and kicking, helping to find random
lost files left behind by other packages
and to file bugs against those from time to time
to get these glitches resolved.



Recently I've been working a lot on it because I realized
it would be the perfect solution to audit the disk space
usage problems I'm facing at work.

So I somewhat whipped up what I remembered from my own proposal
https://wiki.debian.org/Cruft/purge and now have for myself a working
"dh-cruft" that I can use to register dynamic files
owned by some private .deb. Here "dh-cruft" is a must, I don't want to
pollute Debian with some random external data from downstream.

This DebHelper works this way:
* the "debian/cruft" list merely registers the glob patterns,
* and the "debian/purge" list also adds an "rm -rf" stanza in postrm/purge.
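
To illustrate the idea of registering glob patterns, here is a minimal
sketch in Python. It is not how cruft-ng or dh-cruft is implemented: the
package name "mypkg", the patterns and the helper names are made up, and
the real matching rules may differ from Python's fnmatch (where "*" also
matches "/").

```python
from fnmatch import fnmatchcase

# Hypothetical patterns, as they might appear in a package's debian/cruft list.
PATTERNS = ["/var/cache/mypkg/*", "/var/lib/mypkg/state.db"]

def explained(path, patterns=PATTERNS):
    """Return True if any registered glob pattern accounts for this path."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)

def unexplained(paths, patterns=PATTERNS):
    """Return the paths that no registered pattern accounts for."""
    return [path for path in paths if not explained(path, patterns)]

if __name__ == "__main__":
    # Only the stray file under /srv is reported.
    print(unexplained(["/var/cache/mypkg/index.bin", "/srv/stray.tmp"]))
```

A path that matches a registered pattern counts as owned by the package;
anything left over is a candidate for a bug report.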

As a bonus there's now also a new "cpigs" command, working akin to
"dpigs" from Debian Goodies to list the biggest volatile data producers.


The plan now is to have a new option that dumps the whole
matching result database as .json with individual file size
for jq consumption or in my case Jupyter;
this instead of implementing older requests (#291823 #487458 #527285).
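
As a rough idea of what consuming such a dump could look like, here is a
small Python sketch that sums the sizes per package, similar in spirit to
what "cpigs" prints. The JSON layout (a list of objects with "package",
"path" and "size" keys), the sample paths and the sizes are all
assumptions of mine; the real format is not decided yet.

```python
import json
from collections import Counter

# Hypothetical dump: field names, paths and sizes are invented for this sketch.
SAMPLE = """[
  {"package": "apt", "path": "/var/cache/apt/pkgcache.bin", "size": 300},
  {"package": "apt", "path": "/var/lib/apt/lists/example", "size": 200},
  {"package": "npm", "path": "/var/cache/npm/example", "size": 100}
]"""

def biggest(dump, limit=30):
    """Sum file sizes per package and return the largest producers first."""
    totals = Counter()
    for entry in json.loads(dump):
        totals[entry["package"]] += entry["size"]
    return totals.most_common(limit)

if __name__ == "__main__":
    for package, size in biggest(SAMPLE):
        print(size, package)
```

The same grouping could be done with jq or in a Jupyter notebook; the
point is only that per-file sizes make this kind of aggregation easy to
do outside of cruft-ng.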


I know it's a very old unresolved subject that has been lurking forever
here, but maybe it's the right time to look at it with a fresh view.

My proposal for next steps:
  * gather your comments here
  * some review of dh-cruft (I don't know Perl)
  * get it in the NEW queue soon
  * have interested packages take part;
    for now cruft-ng ships its own homegrown fallback database
  * (later): merge dh-cruft into DebHelper when it's basically "done"
  * (much much later): migrate some logic from DH to dpkg itself,
    with a more declarative packaging style;
    cruft-ng is already linked with the static library libdpkg
    and is bound to progress at the same pace.

  * there is still a performance problem in cruft-ng that I wish to improve.
    Basic profiling can be done by setting the ELAPSED=1 env var.

Greetings,

Alexandre Detiste


./cpigs 30
496720816 apt
68957680 npm
61846660 linux-image-5.19.0-1-amd64 (the initrd)
61787431 linux-image-5.19.0-2-amd64
53131401 dlocate
36229735 aptitude
19621198 dpkg
17896745 plocate
13559874 jupyter-nbextension-jupyter-js-widgets
11982526 udev
11870208 openjdk-11-jre-headless
7257544 debconf
5704857 smartmontools
5685370 ttf-mscorefonts-installer
5086033 linux-image-5.18.0-4-amd64 -> rc state
4933502 grub-common
3550208 qgis
3523931 fontconfig
3421312 ucf
3231839 shared-mime-info
3063016 locales
2266947 libreoffice-common   (files seen from explain/ucf)
1901483 grub-pc-bin
1565651 logrotate
1258042 man-db
1107968 ALTERNATIVES         (I thought these were only symlinks ?)
783313 popularity-contest
763776 unattended-upgrades   (du -b /var/log/unattended-upgrades/760422)
657496 breeze-icon-theme
625345 PYEXCEL            (some pip3 automation)
