Hi lynxis,

On Mon, Mar 25, 2019 at 11:04 PM Alexander Couzens <lyn...@fe80.eu> wrote:
> > I'm playing with the different patched versions for hours now.
> > Indeed, it seems with your latest patch the memory leak is bearable.
>
> Nice to hear. I compressed a 8GB tree of toolchains and rootfs with
> valgrind (took > 10h ;) without any noticable leaks.

I used a smaller directory for the rootfs, but ran it several times with
the differently patched squashfs-tools. Due to the leak, the -11 package
version used ten times the memory it was limited to. With your fix it
still uses more, but at most ten percent more.
> > I've never tested CPU usage before. I've realized it's rhapsodical,
> > sometimes all my CPU cores utilized on 100%, sometimes one on ~15%.
> > Should I wait for other fix(es)?
>
> No. I'ven't looked into it. But this is a long standing issue.

But is it caused by your patch, or does the upstream version have the
same problem as well?

> First of all, as squashfs-tools wasn't written in the last years, when
> reproducible builds became more famous. So it's not written
> with reproducible building in mind.
> For me is reproducible builds more important than using all cpu cores.
> But I don't use it with gigabytes images.

Yeah, it's quite old software without any real development in recent
years. I agree that reproducibility is important, but most users might
not care about it. Everyone would like their squashfs generated as
quickly as possible, especially if they have the hardware (a CPU with
many cores) for it. I don't want them to experience degraded
performance.

> It got a bit worse with removing the frag_deflator thread, as there
> is one workload thread less. The old frag_deflator thread has been
> spawnd $cpus times. fragments are small files.

Then it's strange: even with the frag_deflator thread removed, sometimes
all my CPU cores are used and sometimes they are not. But as I
understand you, I should try different compression algorithms.

> Maybe there is a different approach to do it reproducible. Create
> first an index over all files, ensure a proper order through multiple
> queues via an index. But I'm not sure, if this would be really faster
> than it's now. Btw: even with `-processors 1` it will spawn multiple
> threads which should use more than 1 core.

This sounds like more work than can be achieved this week (see the rough
sketch after my signature for how I understood the idea). Maybe a
complete rewrite would be better in the long run.

Cheers,
Laszlo/GCS
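
P.S. Just to check that I understood the "index plus ordered queues"
idea, here is a minimal pthread sketch of my own; it is not
squashfs-tools code, and all names in it are invented for the example.
Workers may finish blocks in any order, but the writer emits them
strictly by index, so the output order no longer depends on thread
scheduling:

/* ordered_writer.c - build with: cc -pthread ordered_writer.c
 * Minimal illustration of reproducible output from parallel workers:
 * compress blocks concurrently, write them strictly in index order. */
#include <pthread.h>
#include <stdio.h>

#define NBLOCKS  16
#define NWORKERS 4

struct result {
    int  ready;            /* set once the block is "compressed" */
    char data[64];         /* placeholder for the compressed payload */
};

static struct result results[NBLOCKS];
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int next_input;     /* next block index handed to a worker */

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        if (next_input >= NBLOCKS) {
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        int idx = next_input++;
        pthread_mutex_unlock(&lock);

        /* Stand-in for the real compressor call. */
        snprintf(results[idx].data, sizeof(results[idx].data),
                 "compressed block %d", idx);

        pthread_mutex_lock(&lock);
        results[idx].ready = 1;
        pthread_cond_broadcast(&cond);  /* wake the writer */
        pthread_mutex_unlock(&lock);
    }
}

int main(void)
{
    pthread_t threads[NWORKERS];
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&threads[i], NULL, worker, NULL);

    /* Writer: wait until block i is ready before writing it, so the
     * output order is always 0, 1, 2, ... regardless of which worker
     * finished first. */
    for (int i = 0; i < NBLOCKS; i++) {
        pthread_mutex_lock(&lock);
        while (!results[i].ready)
            pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);
        printf("write %s\n", results[i].data);
    }

    for (int i = 0; i < NWORKERS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}

Whether something like this on top of the real mksquashfs queues would
be faster than the current code I cannot say; it is only meant to pin
down the ordering idea, not to suggest an implementation.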