Package: lintian
Version: 2.5.13
Severity: minor

Hi,

I have spent a little time looking at the memory consumption of Lintian. I have already optimized a few trivial cases, but I have also seen further potential that may be worth pursuing.

Method: I have only looked at the memory consumed by L::Collect at the end of the run (i.e. the memory data you get with -dddd from the master branch). I suspect there is also a lot to be gained from not slurping copyright files (etc.) in checks without any regard to their size.

Observations so far:

 - (sorted_)index is(/are) the primary memory consumer(s).
 - Fortunately, they share a large part of the memory, so the output of "-dddd" looks worse than it is.
 - AFAICT, memory seems to be "leaked" by Perl allocating large buffers for the strings[1]. With my test case source:linux, I suspect we can save about 5 MB if we can reduce the size of these buffers.
 - changelog is also pretty expensive and could probably very often be shared between different L::Collect instances related to the same processable group. For lintian, it is about 2 MB for both the .deb and the source. The common "non-sharing" case would be binNMUs or a bug in the package[2].
 - If we can dedup this on a disk level, we also recover disk storage. I haven't checked whether this is worth it, even for lintian.d.o. The savings will be a lot smaller on disk though, since Perl/Parse::DebianChangelog spends more memory on the changelog than its actual size (about a factor of 4 for lintian's changelog)[3].

I have attached the memory usage information from running lintian with -dddd (-X man) on some of the linux binaries and their source package, which can be used as a reference.

~Niels

[1] This makes sense if the strings are to be changed later, but generally these strings will only be read.

[2] E.g. installing the wrong changelog in one of the binary packages. :)

[3] If we are doing this disk de-duplication, we can probably trivially apply it to copyright files as well. But back to memory ...

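To illustrate the changelog-sharing idea, here is a minimal sketch in Python (not Lintian's Perl, and not how L::Collect is actually structured). It assumes the parsed changelog can be keyed by a digest of the raw file, so that identical copies in the source and the binaries are parsed and stored once, while a binNMU changelog gets its own entry. `ChangelogCache` and `toy_parser` are hypothetical names; the toy parser merely stands in for Parse::DebianChangelog.

```python
import hashlib


class ChangelogCache:
    """Hypothetical cache: one parsed changelog per distinct file content."""

    def __init__(self, parser):
        self._parser = parser
        self._by_digest = {}
        self.parse_calls = 0  # counts how often the real parser had to run

    def get(self, raw: bytes):
        # Identical changelog files hash to the same digest and therefore
        # share one parsed object instead of one copy per package.
        digest = hashlib.sha256(raw).hexdigest()
        if digest not in self._by_digest:
            self.parse_calls += 1
            self._by_digest[digest] = self._parser(raw)
        return self._by_digest[digest]


def toy_parser(raw: bytes):
    # Stand-in for a real changelog parser (assumption): split the text
    # into blocks separated by blank lines.
    text = raw.decode("utf-8")
    return tuple(block for block in text.split("\n\n") if block.strip())


cache = ChangelogCache(toy_parser)

source_copy = b"lintian (2.5.13) unstable; urgency=low\n\n  * Example entry.\n"
deb_copy = bytes(source_copy)  # same content shipped in the .deb
binnmu_copy = (
    b"lintian (2.5.13+b1) unstable; urgency=low\n\n"
    b"  * Binary-only rebuild.\n\n" + source_copy
)

shared_a = cache.get(source_copy)
shared_b = cache.get(deb_copy)
separate = cache.get(binnmu_copy)
```

Here `shared_a` and `shared_b` refer to the same object, while the binNMU copy falls into the "non-sharing" case and is parsed separately. The same keying would presumably work for de-duplicating the files on disk.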
N: Memory usage [source:linux/3.2.20-1]: 35.91 MB
N: -- base_dir: 120.00 B
N: -- binaries: 105.41 kB
N: -- binary_field: 1022.11 kB
N: -- binary_relation: 1513.17 kB
N: -- changelog: 1440.98 kB
N: -- debfiles: 116.00 B
N: -- diffstat: 116.00 B
N: -- field: 109.57 kB
N: -- file_info: 3.92 MB
N: -- index: 29.32 MB
N: -- is_non_free: 116.00 B
N: -- name: 64.00 B
N: -- native: 20.00 B
N: -- relation: 12.07 kB
N: -- relation_noarch: 5.85 kB
N: -- sorted_index: 28.74 MB
N: -- source_field: 1.71 kB
N: -- type: 64.00 B
N: -- unpacked: 116.00 B
N: Memory usage [binary:linux-source-3.2/3.2.20-1/all]: 1462.71 kB
N: -- base_dir: 136.00 B
N: -- changelog: 1440.99 kB
N: -- control: 144.00 B
N: -- control-index: 2.38 kB
N: -- field: 2.00 kB
N: -- file_info: 1249.00 B
N: -- index: 9.39 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 904.00 B
N: -- name: 72.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 4.98 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 8.42 kB
N: -- type: 64.00 B
N: Memory usage [binary:linux-support-3.2.0-2/3.2.20-1/all]: 1476.09 kB
N: -- base_dir: 140.00 B
N: -- changelog: 1440.99 kB
N: -- control: 148.00 B
N: -- control-index: 3.66 kB
N: -- field: 1473.00 B
N: -- file_info: 3.01 kB
N: -- index: 22.93 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 2.45 kB
N: -- name: 76.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1.59 kB
N: -- scripts: 625.00 B
N: -- sorted_control-index: 2.78 kB
N: -- sorted_index: 21.78 kB
N: -- type: 64.00 B
N: -- unpacked: 152.00 B
N: Memory usage [binary:linux-doc-3.2/3.2.20-1/all]: 7.65 MB
N: -- base_dir: 132.00 B
N: -- changelog: 1440.99 kB
N: -- control: 140.00 B
N: -- control-index: 2.38 kB
N: -- field: 1.57 kB
N: -- file_info: 1016.66 kB
N: -- index: 5.03 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 1150.69 kB
N: -- name: 68.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1476.00 B
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 4.95 MB
N: -- type: 64.00 B
N: -- unpacked: 144.00 B
N: Memory usage [binary:linux-image-3.2.0-2-amd64/3.2.20-1/amd64]: 9.28 MB
N: -- base_dir: 148.00 B
N: -- changelog: 1441.00 kB
N: -- control: 156.00 B
N: -- control-index: 5.65 kB
N: -- field: 2.21 kB
N: -- file_info: 893.27 kB
N: -- index: 2.78 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 532.94 kB
N: -- name: 80.00 B
N: -- native: 64.00 B
N: -- objdump_info: 4.35 MB
N: -- relation: 7.02 kB
N: -- scripts: 602.00 B
N: -- sorted_control-index: 4.63 kB
N: -- sorted_index: 2.74 MB
N: -- strings: 156.00 B
N: -- type: 64.00 B
N: -- unpacked: 156.00 B
N: Memory usage [binary:linux-manual-3.2/3.2.20-1/all]: 4.65 MB
N: -- base_dir: 136.00 B
N: -- changelog: 1440.99 kB
N: -- control: 144.00 B
N: -- control-index: 2.38 kB
N: -- field: 1.96 kB
N: -- file_info: 338.43 kB
N: -- index: 2.70 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 566.65 kB
N: -- name: 72.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 2.12 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 2.65 MB
N: -- type: 64.00 B
N: Memory usage [binary:linux-manual-3.2/3.2.35-2/all]: 4.75 MB
N: -- base_dir: 136.00 B
N: -- changelog: 1525.74 kB
N: -- control: 144.00 B
N: -- control-index: 2.38 kB
N: -- field: 2.06 kB
N: -- file_info: 339.34 kB
N: -- index: 2.70 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 568.13 kB
N: -- name: 72.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1.81 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 2.66 MB
N: -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-amd64/3.2.35-2/amd64]: 6.86 MB
N: -- base_dir: 148.00 B
N: -- changelog: 1525.75 kB
N: -- control: 156.00 B
N: -- control-index: 3.02 kB
N: -- field: 1.77 kB
N: -- file_info: 691.72 kB
N: -- index: 4.63 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 761.51 kB
N: -- name: 80.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 2.83 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 2.17 kB
N: -- sorted_index: 4.55 MB
N: -- type: 64.00 B
N: Memory usage [binary:xen-linux-system-3.2.0-4-amd64/3.2.35-2/amd64]: 1.50 MB
N: -- base_dir: 152.00 B
N: -- changelog: 1525.75 kB
N: -- control: 160.00 B
N: -- control-index: 2.38 kB
N: -- field: 1409.00 B
N: -- file_info: 815.00 B
N: -- index: 6.22 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 454.00 B
N: -- name: 84.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 2.22 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 5.32 kB
N: -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-rt-amd64/3.2.35-2/amd64]: 6.85 MB
N: -- base_dir: 152.00 B
N: -- changelog: 1525.75 kB
N: -- control: 160.00 B
N: -- control-index: 3.02 kB
N: -- field: 1.78 kB
N: -- file_info: 704.93 kB
N: -- index: 4.62 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 770.92 kB
N: -- name: 84.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 2.83 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 2.17 kB
N: -- sorted_index: 4.54 MB
N: -- type: 64.00 B
N: Memory usage [binary:linux-support-3.2.0-4/3.2.35-2/all]: 1.52 MB
N: -- base_dir: 140.00 B
N: -- changelog: 1525.74 kB
N: -- control: 148.00 B
N: -- control-index: 3.66 kB
N: -- field: 1.54 kB
N: -- file_info: 2.98 kB
N: -- index: 22.93 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 2.45 kB
N: -- name: 76.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1.50 kB
N: -- scripts: 625.00 B
N: -- sorted_control-index: 2.78 kB
N: -- sorted_index: 21.78 kB
N: -- type: 64.00 B
N: -- unpacked: 152.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-all/3.2.35-2/amd64]: 1.50 MB
N: -- base_dir: 148.00 B
N: -- changelog: 1525.74 kB
N: -- control: 156.00 B
N: -- control-index: 2.38 kB
N: -- field: 1525.00 B
N: -- file_info: 800.00 B
N: -- index: 6.20 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 444.00 B
N: -- name: 80.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1.52 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 5.30 kB
N: -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-all-amd64/3.2.35-2/amd64]: 1.50 MB
N: -- base_dir: 152.00 B
N: -- changelog: 1525.75 kB
N: -- control: 160.00 B
N: -- control-index: 2.38 kB
N: -- field: 1.53 kB
N: -- file_info: 818.00 B
N: -- index: 6.23 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 456.00 B
N: -- name: 84.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 2.32 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 5.32 kB
N: -- type: 64.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-common/3.2.35-2/amd64]: 4.46 MB
N: -- base_dir: 152.00 B
N: -- changelog: 1525.75 kB
N: -- control: 160.00 B
N: -- control-index: 2.38 kB
N: -- field: 1.53 kB
N: -- file_info: 374.14 kB
N: -- index: 2.48 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 562.20 kB
N: -- name: 84.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1393.00 B
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 2.44 MB
N: -- type: 64.00 B
N: Memory usage [binary:linux-doc-3.2/3.2.35-2/all]: 7.74 MB
N: -- base_dir: 132.00 B
N: -- changelog: 1525.73 kB
N: -- control: 140.00 B
N: -- control-index: 2.38 kB
N: -- field: 1.67 kB
N: -- file_info: 1017.94 kB
N: -- index: 5.04 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 1152.63 kB
N: -- name: 68.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1393.00 B
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 4.96 MB
N: -- type: 64.00 B
N: -- unpacked: 144.00 B
N: Memory usage [binary:linux-libc-dev/3.2.35-2/amd64]: 2.13 MB
N: -- base_dir: 136.00 B
N: -- changelog: 1525.74 kB
N: -- control: 144.00 B
N: -- control-index: 2.38 kB
N: -- field: 1.79 kB
N: -- file_info: 61.68 kB
N: -- index: 541.39 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 102.33 kB
N: -- name: 68.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1.83 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 531.17 kB
N: -- type: 64.00 B
N: -- unpacked: 144.00 B
N: Memory usage [binary:linux-headers-3.2.0-4-common-rt/3.2.35-2/amd64]: 4.48 MB
N: -- base_dir: 152.00 B
N: -- changelog: 1525.75 kB
N: -- control: 160.00 B
N: -- control-index: 2.38 kB
N: -- field: 1.53 kB
N: -- file_info: 384.59 kB
N: -- index: 2.50 MB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 573.07 kB
N: -- name: 84.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 1393.00 B
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 2.46 MB
N: -- type: 64.00 B
N: Memory usage [binary:linux-source-3.2/3.2.35-2/all]: 1.51 MB
N: -- base_dir: 136.00 B
N: -- changelog: 1525.74 kB
N: -- control: 144.00 B
N: -- control-index: 2.38 kB
N: -- field: 2.11 kB
N: -- file_info: 1237.00 B
N: -- index: 9.39 kB
N: -- is_non_free: 52.00 B
N: -- java_info: 92.00 B
N: -- md5sums: 904.00 B
N: -- name: 72.00 B
N: -- native: 64.00 B
N: -- objdump_info: 92.00 B
N: -- relation: 5.32 kB
N: -- scripts: 92.00 B
N: -- sorted_control-index: 1.56 kB
N: -- sorted_index: 8.42 kB
N: -- type: 64.00 B
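
For anyone who wants to crunch these numbers, the following is a rough Python sketch that parses "Memory usage" output with one "N:" entry per line and reports how much of a package's total is taken by a given field. The function name `parse_memory_log` is hypothetical, and the unit conversion assumes kB and MB are multiples of 1024, which I have not verified against the code that prints them.

```python
import re

# Assumption: kB and MB in the debug output are binary multiples.
UNIT = {"B": 1, "kB": 1024, "MB": 1024 ** 2}

HEADER = re.compile(
    r"^N: Memory usage \[(?P<name>[^\]]+)\]: (?P<num>[\d.]+) (?P<unit>B|kB|MB)$"
)
FIELD = re.compile(
    r"^N:\s+-- (?P<key>[\w-]+): (?P<num>[\d.]+) (?P<unit>B|kB|MB)$"
)


def parse_memory_log(text):
    """Return {package: {"total": bytes, "fields": {field: bytes}}}."""
    usage = {}
    current = None
    for line in text.splitlines():
        line = line.rstrip()
        match = HEADER.match(line)
        if match:
            current = {
                "total": float(match["num"]) * UNIT[match["unit"]],
                "fields": {},
            }
            usage[match["name"]] = current
            continue
        match = FIELD.match(line)
        if match and current is not None:
            size = float(match["num"]) * UNIT[match["unit"]]
            current["fields"][match["key"]] = size
    return usage


# A few lines taken from the output above.
SAMPLE = """\
N: Memory usage [binary:linux-source-3.2/3.2.20-1/all]: 1462.71 kB
N: -- changelog: 1440.99 kB
N: -- index: 9.39 kB
N: -- sorted_index: 8.42 kB
"""

usage = parse_memory_log(SAMPLE)
entry = usage["binary:linux-source-3.2/3.2.20-1/all"]
changelog_share = entry["fields"]["changelog"] / entry["total"]
```

For this small package the changelog accounts for about 98.5% of the reported total, which is why sharing it within a processable group looks attractive. Note that the per-field numbers cannot simply be summed, since index and sorted_index share a large part of their memory.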