On Mon, 11 Nov 2024 11:22:26 +0100 Uwe Kleine-König wrote:

[...]
> Hello,

Hi Uwe, thanks for your follow-up.

> 
> On Thu, Oct 31, 2024 at 07:53:52PM +0100, Francesco Poli (wintermute) wrote:
[...]
> > I filed this bug report against the Debian Linux kernel, in order
> > to warn other users about this issue, and in order to ask the Debian
> > Kernel Team to investigate the issue and/or to forward the bug report
> > to the relevant upstream Linux kernel maintainers.
> > 
> > Please do not reassign to package opensm with the intention of
> > merging with bug [#1085300], unless you know for sure that the
> > issue is in opensm and you know how to fix it.
> 
> Please do not report multiple bugs for the same issue. The right(er)
> thing to do is to make use of "affects". Now there are three bug reports
> (2 for Debian and one upstream) and someone being aware of only one (or
> two) of them, might miss some action which results in duplicate work.

You are right, the "affects" field is the most appropriate means to
show that a bug report against a given package also affects other
packages.

However, in this case, the lack of replies from opensm maintainers made
me doubtful about the best possible course of action. Sorry about that.

>  
> > Please help, I would very much like to run the head node with
> > an up-to-date kernel!
> 
> This is hard to act on without further input. Some questions to debug
> this:
> 
> I guess the kernel provides a directory "/sys/class/infiniband_mad". Do
> its contents look different on 6.10.x and 6.11.x?

I will look into this as soon as I can reboot the cluster head node.
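
For that comparison, something along these lines could capture the
directory under each kernel, so that the two outputs can be saved and
diffed. This is only a sketch: the helper name `snapshot` is my own,
and I am assuming that the entries are mostly symlinks and small
text attributes, which I have not yet verified on the head node.

```python
import os
import platform
import sys


def snapshot(root):
    """Return sorted, one-per-entry description lines for everything below root."""
    lines = []
    # os.walk does not follow symlinks by default, so device links are
    # listed with their targets but not descended into.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                lines.append(f"{rel} -> {os.readlink(path)}")
            elif os.path.isdir(path):
                lines.append(f"{rel}/")
            else:
                try:
                    with open(path, "r", errors="replace") as fh:
                        value = fh.read().strip()
                except OSError as exc:
                    # Some sysfs attributes are write-only or root-only.
                    value = f"<unreadable: {exc.strerror}>"
                lines.append(f"{rel} = {value}")
    return sorted(lines)


if __name__ == "__main__":
    # Default path is the directory mentioned above; pass another as argv[1].
    root = sys.argv[1] if len(sys.argv) > 1 else "/sys/class/infiniband_mad"
    print(f"kernel: {platform.release()}")
    print("\n".join(snapshot(root)))
```

Running it once under 6.10.x and once under 6.11.x, redirecting the
output to two files, would give something that `diff -u` can compare.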

> 
> Can you please bisect the problem?
[...]

I have to find a time window where I can perform multiple reboots,
which can result in a non-working InfiniBand network... It won't be
easy, since the cluster has entered production and users keep launching
jobs.

Anyway, here is what I have done so far: I tried to rebuild a Linux
kernel image Debian package, following your instructions.
After some failed attempts (due to missing dependencies and/or required
tools), I think I succeeded, but I had to answer a number of
questions during the procedure. I always chose the default answer
(by hitting [Enter]); I hope that was the right thing to do!

Before I go on and try to install the resulting Debian package, could
you please review the transcript of what I did (see the attached file)?

Please bear with me, some of the questions were really obscure to me
and I am not really familiar with the procedure: I think that the last
time I rebuilt a Linux kernel image Debian package was some 15 years
ago (I was still using the now-obsolete [kernel-package]!).

[kernel-package]: <https://tracker.debian.org/pkg/kernel-package>

Thanks for your time and for the help you are providing.


-- 
 http://www.inventati.org/frx/
 There's not a second to spare! To the laboratory!
..................................................... Francesco Poli .
 GnuPG key fpr == CA01 1147 9CD2 EFDF FB82  3925 3E1C 27E1 1F69 BFFE

Attachment: git_bisect.txt.gz
Description: application/gzip

