Hi folks,

My access to ppc64el has expired, since CUDA (>> 12.4) no longer supports
ppc64el for the trixie+1 cycle. However, the recent CUDA 12.2 -> 12.4
transition requires a rebuild of pytorch-cuda, and I have already lost access.

The help I need is pretty simple -- manually rebuild pytorch-cuda and
upload the resulting binaries. Note that the build process
involves two major non-free dependencies:

(1) nvidia-cuda-toolkit: from non-free section
(2) nvidia-cudnn: an installer package (maintained by me) that downloads
    binary blobs during postinst.

These are the direct reason why XS-Autobuild and the porterboxes cannot be used.
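Because of those dependencies, the build chroot itself must carry the
contrib/non-free components so that nvidia-cuda-toolkit is installable. A
sketch of creating such a chroot is below; the mirror URL and chroot path are
assumptions for illustration, and the command is shown behind an "echo" prefix
so nothing is modified -- drop the "echo" (and run as root) to actually apply:

```shell
# Sketch only: create an sbuild chroot with non-free enabled.
# The mirror and /srv/chroot path are examples; adjust to your setup.
echo sudo sbuild-createchroot --arch=ppc64el \
  --components=main,contrib,non-free,non-free-firmware \
  unstable /srv/chroot/unstable-ppc64el-sbuild \
  http://deb.debian.org/debian
```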

Steps
=====

1. get the source of pytorch-cuda and make sure the version is 2.6.0+dfsg-7

apt source pytorch-cuda

2. do the manual binNMU with sbuild

sbuild --no-clean -c unstable-ppc64el-sbuild \
 --build=ppc64el --arch=ppc64el \
 --binNMU=1 --make-binNMU="Rebuild against CUDA 12.4." \
 -m "your name <your email>" \
 pytorch-cuda_2.6.0+dfsg-7.dsc -d sid

3. sign the built packages and upload

debsign pytorch-cuda_2.6.0+dfsg-7+b1_ppc64el.changes
dput ftp-master pytorch-cuda_2.6.0+dfsg-7+b1_ppc64el.changes


Parallelism and RAM
===================

On amd64/ppc64el, building pytorch-cuda needs 4GB of RAM per job to
avoid OOM during the parallel link. On arm64 it needs 8GB per job.
It is fine to allocate a large swap, since it is mostly used to absorb
the RAM spikes during parallel linker invocations.
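If you want to size the parallel level from the figures above, a rough
calculation like the following can help. This is my own sketch, not part of
the package's build system; it assumes 4 GiB per link job (use 8 on arm64)
and counts swap toward the budget:

```shell
# Sketch: pick a parallel level so that each link job gets ~4 GiB
# (RAM + swap). On arm64, set gib_per_job=8 instead.
gib_per_job=4
mem_kb=$(awk '/^(MemTotal|SwapTotal):/ {sum += $2} END {print sum}' /proc/meminfo)
jobs=$(( mem_kb / (gib_per_job * 1024 * 1024) ))
[ "$jobs" -ge 1 ] || jobs=1
echo "suggested parallel level: $jobs"
```

The resulting number can then be passed to the build, e.g. via sbuild's
--debbuildopt=-j$jobs.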

I have already done the amd64 rebuild:
https://buildd.debian.org/status/package.php?p=pytorch%2dcuda

My arm64 rebuild is underway, but it will take roughly one day
on my Raspberry Pi 5. If you have a faster arm64 device,
feel free to do the rebuild and upload before I finish.
Note that arm64 needs roughly 8GB of RAM/swap per job to avoid OOM.
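For machines short on RAM, a swap file can provide that headroom. A sketch
follows; the 64G size is an assumption (scale it to jobs x 8GB on arm64),
and the commands are shown behind "echo" prefixes so nothing is modified --
drop the prefixes and run as root to actually apply:

```shell
# Sketch: allocate and enable a swap file to absorb linker RAM spikes.
# 64G is an example size; adjust to your job count.
echo fallocate -l 64G /swapfile
echo chmod 600 /swapfile
echo mkswap /swapfile
echo swapon /swapfile
```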
