On Mon, Jul 29, 2024, at 10:28 PM, Andres Salomon wrote:
> On 7/29/24 16:10, Soren Stoutner wrote:
>> On Monday, July 29, 2024 1:18:05 AM MST Andres Salomon wrote:
>>> It's unfortunately going to have to wait. We're switching standard
>>> libraries, and linking to external libs is a bit rocky right now.
>> 
>> Waiting until things settle is fine.  This has been an issue for so long
>> that I have become a patient man.
>>   
>>> On the plus side, I reduced the time it takes to generate the
>>> orig.tar.xz from ~40 minutes to ~5 minutes, which should help a lot with
>>> testing the deletion of vendored libraries in the future!
>> 
>> That’s impressive.  How did you accomplish that?
>> 
>
> Debian's mk-origtargz script (which is what uscan calls) doesn't work 
> for us, because 'tar --delete' doesn't scale as d/copyright's 
> Files-Excluded increases (see #995770).
>
> Mike (prior chromium maintainer) instead patched mk-origtargz to (1) 
> print out the files that would be deleted, (2) untar the _entire_ 
> upstream chromium tarball (which at this point is huge at 6.2GB), (3) 
> loop over the list of files to delete, deleting them one by one, and 
> (4) pack up the new tarball. It worked okay when chromium's upstream 
> tarball was roughly 1GB, but it has really ballooned lately.
>
> I replaced the first three steps with a single 'tar --exclude-from', so 
> that we save time by not writing deleted files to disk only to manually 
> delete them:
> https://salsa.debian.org/chromium-team/chromium/-/commit/cd5bf2ed6c848ea054718d8f658aa2b38c681d2c
>
> I would love to get this into mk-origtargz proper so that chromium could 
> use uscan (and also everyone in debian maintaining larger packages would 
> benefit), but I'm not even sure where to begin. Maybe as a separate 
> python mk-origtar tool? Maybe as a patch to mk-origtargz with a 
> command-line option to fall back to tar --delete? Perhaps d-d has an idea.
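
To make the single-pass idea quoted above concrete, here is a rough Python
sketch. It is not what the linked commit does (that uses 'tar
--exclude-from'), and it is not how mk-origtargz works; it only illustrates
filtering a tarball in one streaming pass without writing excluded files to
disk. The names repack_excluding and is_excluded are made up for this
example, and the pattern matching is a simplified assumption rather than the
real Files-Excluded semantics.

```python
import fnmatch
import tarfile


def is_excluded(name, patterns):
    """Return True if a member path matches any exclusion pattern.

    Assumption: patterns are shell-style globs relative to the tarball's
    top-level directory. This is a simplification for illustration.
    """
    # Drop the top-level directory (e.g. "pkg-1.0/") before matching.
    _, _, rel = name.partition("/")
    parts = rel.split("/") if rel else []
    # Test the path and each parent directory, so a pattern that matches a
    # directory also excludes everything beneath it.
    for i in range(1, len(parts) + 1):
        prefix = "/".join(parts[:i])
        if any(fnmatch.fnmatchcase(prefix, p) for p in patterns):
            return True
    return False


def repack_excluding(src, dst, patterns):
    """Copy tar archive src to an xz-compressed dst, skipping excluded members.

    Returns the number of members skipped.
    """
    skipped = 0
    # "r|*" and "w|xz" are tarfile's streaming modes: one sequential pass,
    # no seeking, and nothing is unpacked to disk.
    with tarfile.open(src, "r|*") as tin, tarfile.open(dst, "w|xz") as tout:
        for member in tin:
            if is_excluded(member.name, patterns):
                skipped += 1
                continue
            # Only regular files carry data; links and directories do not.
            fileobj = tin.extractfile(member) if member.isreg() else None
            tout.addfile(member, fileobj)
    return skipped
```

Something like repack_excluding("upstream.tar.xz", "pkg_1.0.orig.tar.xz",
["third_party/llvm"]) would then write the pruned tarball directly; the file
names and the pattern here are placeholders.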

FWIW, having this supported in uscan (I don't really care *how* that would be 
implemented tbh ;)) would be great, and would save me about an hour of waiting 
for repeated repacking every few weeks when updating rustc/cargo. I assume 
there are a few other packages that do similarly involved pruning of bloated 
upstream tarballs and would also benefit.

For Rust, we remove about 2/3 of the upstream tarball[0], both by file size and 
by file count, but that's nowhere near what src:chromium does (or rather, has 
to do). It's a mix of embedded copies of other projects (e.g., LLVM) that we 
don't want since we use the ones provided by standalone packages, and toolchain 
components and their vendored deps that are not used for the Debian build.

Technically we could also keep all of that in (or at least greatly reduce the 
exclusion list, since almost none of it would be undistributable), but that 
would make it a lot more difficult both to ensure the build doesn't 
accidentally pick up any of the undesired things and to keep d/copyright 
current.

[0] https://salsa.debian.org/rust-team/rust/-/blob/debian/sid/debian/copyright?ref_type=heads#L4-274
