M. Zhou wrote:
> Just one comment.
> 
> Be careful if it bloats up our mirrors. Is there any estimate on
> the extra space cost for a full debian mirror?
> 
> If we trade-off the disk space with decompression speed, zstd -19
> is not necessarily very fast. I did not benchmark, but it is slow.
Anecdotally, for "linux-image-6.5.0-1-amd64_6.5.3-1_amd64.deb" the data.tar
component takes 72 MB when compressed with xz -6, and 80 MB when compressed
with zstd -19, so about 10% larger with zstd.

This is specifically with multi-threaded compression. The behavior of -T0
between xz and zstd is slightly different. (It looks like xz -T0 uses the
number of threads supported by the CPU while zstd -T0 uses the number of
physical cores in the CPU.) The most direct multi-threaded compression
comparison between the two compressors was:

$ time xz -v -k -T0 -6 data.tar
data.tar (1/1)
  100 %        71.9 MiB / 452.5 MiB = 0.159    21 MiB/s       0:21

 Performance counter stats for 'xz -v -k -T0 -6 data.tar':

        206,070.39 msec task-clock            #    9.602 CPUs utilized
            10,333      context-switches      #   50.143 /sec
                35      cpu-migrations        #    0.170 /sec
            73,502      page-faults           #  356.684 /sec
   925,351,049,292      cycles                #    4.490 GHz
   945,596,486,369      instructions          #    1.02  insn per cycle
   106,039,632,660      branches              #  514.580 M/sec
     6,702,750,057      branch-misses         #    6.32% of all branches

      21.460119122 seconds time elapsed

     205.460711000 seconds user
       0.567559000 seconds sys

Versus:

$ time zstd -T0 --auto-threads=logical -19 data.tar
data.tar             : 17.54%   (   452 MiB =>   79.3 MiB, data.tar.zst)

 Performance counter stats for 'zstd -T0 --auto-threads=logical -19 data.tar':

        293,120.46 msec task-clock            #    8.649 CPUs utilized
            21,754      context-switches      #   74.215 /sec
                78      cpu-migrations        #    0.266 /sec
             9,806      page-faults           #   33.454 /sec
 1,317,565,940,985      cycles                #    4.495 GHz
 1,430,204,017,430      instructions          #    1.09  insn per cycle
   266,246,644,005      branches              #  908.318 M/sec
     5,762,322,300      branch-misses         #    2.16% of all branches

      33.889831439 seconds time elapsed

     292.501337000 seconds user
       0.567560000 seconds sys

So, 71.9 MiB in 21 seconds for xz -6 versus 79.3 MiB in 34 seconds for
zstd -19. In other words, xz is 91% the size and 63% the wallclock time of
zstd here.

zstd decompression is much, much faster than xz decompression, but apparently
zstd does not support multi-threaded decompression while xz does. Here xz
decompresses in about 120% the wallclock time of zstd (about 0.6 seconds for
xz vs 0.5 seconds for zstd) but is only able to perform that well by
occupying most of the CPU:

$ time xzcat -v -T12 data.tar.xz > /dev/null
data.tar.xz (1/1)
  100 %        71.9 MiB / 452.5 MiB = 0.159

 Performance counter stats for 'xzcat -v -T12 data.tar.xz':

          5,434.51 msec task-clock            #    8.720 CPUs utilized
             1,187      context-switches      #  218.419 /sec
                22      cpu-migrations        #    4.048 /sec
            24,119      page-faults           #    4.438 K/sec
    24,311,239,346      cycles                #    4.473 GHz
    21,196,398,588      instructions          #    0.87  insn per cycle
     2,841,057,067      branches              #  522.781 M/sec
       296,751,808      branch-misses         #   10.45% of all branches

       0.623224953 seconds time elapsed

       5.304562000 seconds user
       0.127532000 seconds sys

$ time zstdcat -v -T12 data.tar.zst > /dev/null
Warning : decompression does not support multi-threading

 Performance counter stats for 'zstdcat -v -T12 data.tar.zst':

            559.03 msec task-clock            #    1.075 CPUs utilized
             4,245      context-switches      #    7.593 K/sec
                 5      cpu-migrations        #    8.944 /sec
             1,032      page-faults           #    1.846 K/sec
     2,519,428,855      cycles                #    4.507 GHz
     5,752,165,946      instructions          #    2.28  insn per cycle
       943,510,461      branches              #    1.688 G/sec
        17,026,238      branch-misses         #    1.80% of all branches

       0.520219563 seconds time elapsed

       0.518084000 seconds user
       0.044177000 seconds sys

If xzcat is restricted to a single core the performance is much worse (about
3.5 seconds for xz vs 0.5 seconds for zstd), although I understand from
another post in the thread that dpkg performs multi-threaded xz decompression.
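For anyone who wants to reproduce this, a rough sketch of the steps (the
extraction command below is just one way to get the uncompressed data.tar out
of the .deb; dpkg-deb --fsys-tarfile writes the filesystem tarball to stdout,
and the compressor invocations are the same as above):

$ dpkg-deb --fsys-tarfile linux-image-6.5.0-1-amd64_6.5.3-1_amd64.deb > data.tar
$ xz -v -k -T0 -6 data.tar                         # produces data.tar.xz
$ zstd -T0 --auto-threads=logical -19 data.tar     # produces data.tar.zst
$ xzcat -T12 data.tar.xz > /dev/null
$ zstdcat -T12 data.tar.zst > /dev/null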
This is on an ordinary "Intel(R) Xeon(R) E-2236 CPU @ 3.40GHz", which is a
four-year-old, 6-core, 12-thread processor.

-- 
Robert Edmonds
edmo...@debian.org