[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2025-01-06 Thread Philip Cox
** Description changed:
+
+ SRU Justification:
+
+ [Impact]
+
+ There is a kernel hard lockup where all CPUs are stuck acquiring an
+ already-locked spinlock (css_set_lock) within the cgroup subsystem when
+ the user is running a certain eBPF program.
+
+ This has been hit in focal 5.15 backpo

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2025-01-06 Thread Philip Cox
** No longer affects: linux-aws-5.15 (Ubuntu Focal)
** No longer affects: linux (Ubuntu Focal)
** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
   Status: New
** Also affects: linux-aws-5.15 (Ubuntu Jammy)
   Importance: Undecided
   Status: New
** Changed in: linux (Ubunt

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2025-01-06 Thread Philip Cox
Hi Max, I plan to send the review out today, or tomorrow. I am aiming to get it in the 2025.01.13 SRU cycle. The 2025.01.13 SRU cycle has a patch cut-off date of 08-Jan, and a planned release date the week of 10-Feb. Once I send the patch out for review, I will update the ticket. I will update

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2025-01-06 Thread Max Wolffe
Hey friend - I hope you are well and had good holidays. Just checking in here to understand when we're likely to be able to pull the fix from Ubuntu mainline. Thanks in advance! -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://b

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-20 Thread Philip Cox
Max, thanks for doing the testing on this; that helped a lot too. I will submit the patch to the Ubuntu kernels, and probably the upstream stable kernel as well, in the new year. Have a good holiday season too.

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-20 Thread Max Wolffe
Philip - we have no reports of kernel hangs in our staging environment since the deployment. I think we can consider the patch to have fixed the issue. Thank you again for your hard work getting this patched! Have a lovely holiday season.

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-19 Thread Max Wolffe
Thanks for your patience on this testing. Just confirming here that Azure, GCP, and AWS all built correctly with your change. We deployed the fix this afternoon and are monitoring for any issues; we will report results tomorrow morning.

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-17 Thread Philip Cox
Not a problem. Once you confirm, I will submit the patches to the generic Ubuntu kernel, and it will land in the gcp/aws/azure kernels via the regular SRU update route. I will update the ticket when I've sent out the review for the patches.

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-17 Thread Max Wolffe
Thank you Philip - we can only reproduce this in our own environment at high load - so I think it will be hard to reproduce in a small environment. I will test this today though and confirm the fix, thank you again for your help :D

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-17 Thread Philip Cox
I've uploaded packages for gcp and azure for you to test. I also looked into why the Arm build didn't work, and have resolved that. I updated the aws focal packages and version 1-100 now is building for arm64 and amd64 in the PPA. It should be ready to test shortly. gcp and azure only have

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-16 Thread Max Wolffe
Totally understand, thank you again for your help Philip!

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-16 Thread Philip Cox
Thanks for the update. I will look into the ARM build. I will try to put together the gcp and azure test kernels. At the end of the week I will be off until the new year, so I will do what I can to get something to you before then, but if I can't, I apologise. And thank you for testing the buil

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-13 Thread Max Wolffe
Also - the build seems to work well for x86, but I get the following for ARM: ``` root@ip-172-31-59-2:/home/ubuntu# apt list | grep WARNING: apt does not have a stable CLI interface. Use with caution in scripts. invesalius-bin/focal 3.1.2-3build2 arm64 invesalius-examples/focal 3.1.

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-13 Thread Max Wolffe
Thank you sir. Yes, that understanding is correct: We're running Ubuntu 20.04 - 5.15 for Azure and GCP both. And just so I understand - once this is merged into the Ubuntu 5.15 branch, we'll likely be able to pick it up from the official ubuntu source in the new year?

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-13 Thread Philip Cox
Max, I will see about gcp and azure. If they are the 5.15 based kernel, it shouldn't be too bad. I am planning on putting the fix into all Ubuntu kernels, so gcp, azure, and aws will pick them up via the SRU update process, but that won't happen until the new year as the last SRU cycle for the ye

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-13 Thread Max Wolffe
Thank you Philip - we're testing this now, but the build is looking promising so far. Would it be possible for us to get similar packages for azure/gcp as well?

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-13 Thread Philip Cox
I've just uploaded the focal packages and they are building in the PPA. You should be able to install and test it once it finishes building. It is the same PPA as before.

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-12 Thread Philip Cox
I forgot it was focal. I’ll upload a 5.15 based focal package tomorrow.

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-12 Thread Max Wolffe
Hey Philip, this is great thank you - is there any way we could get this for Focal? Or is there an easy way for me to install this kernel for focal from your PPA? When I list focal releases for your ppa I get the following: ``` > sudo apt --allow-unauthenticated update ... Err:17 http://ppa.launc

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-12 Thread Philip Cox
Alright, I have uploaded the changes to a PPA and it is currently building. The PPA is at: https://launchpad.net/~philcox/+archive/ubuntu/lp2089318-kernel-hard-lockup There are directions on how to add the PPA on that page, and they list it as: --- You can update your system with unsuppo

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-12 Thread Max Wolffe
Amazing, thank you Philip! Looking forward to testing :)

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-12 Thread Philip Cox
Hi Max, I have backported the change and done some very basic testing on it. I haven't hit any issues with it. I will build it in a PPA and provide you with instructions on how to install the kernel if you would like to test it. It will likely take a few hours for it to build in the PPA, so I wi

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-09 Thread Max Wolffe
Thank you Philip!

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-09 Thread Philip Cox
The backport of the fix is a little trickier than I had expected. I will update you when I have something to test. I should have it sometime this week I hope. ** Changed in: linux-aws-5.15 (Ubuntu) Status: Triaged => In Progress

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-02 Thread Philip Cox
Hi Max, Thanks for the update, and the work involved in this! At first glance, it does seem reasonable. I will go over it in more depth, and get back to you, but I just wanted to answer the other questions first: > 2. If 1) can Canonical backport this fix to the 5.15 and 5.0.4-fips kernels?

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-12-02 Thread Max Wolffe
Philip Cox - I think we have an RCA. Below is the call stack of “iptables” at the moment of the hang (which is the same across all collected kernel dumps): ``` crash> bt 25894 PID: 25894 TASK: 89094bce8000 CPU: 1 COMMAND: "iptables" #0 [adb9456ab8f8] __schedule at a5ba8b8d #1

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-11-28 Thread Max Wolffe
Hey Philip, Thank you for the response. We think we've isolated an eBPF program we're running which might cause this interaction, I'll see on Monday if I can get you some more information to help debug. > 1) Can you please run the command: apport-collect 2089318 Will aim to get you this o

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-11-27 Thread Philip Cox
Hi Max, thank you for taking the time to report this issue. I would like to clarify a few things, and perhaps get some additional information on the issue you are hitting. 1) Can you please run the command: apport-collect 2089318 preferably right after hitting this issue (on the next reboo

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-11-27 Thread Philip Cox
** Changed in: linux (Ubuntu)
   Assignee: (unassigned) => Philip Cox (philcox)
** Changed in: linux (Ubuntu Jammy)
   Assignee: (unassigned) => Philip Cox (philcox)
** Changed in: linux-aws-5.15 (Ubuntu)
   Assignee: (unassigned) => Philip Cox (philcox)
** Changed in: linux-aws-5.15 (Ubun

[Bug 2089318] Re: kernel hard lockup 5.15.0-1072-aws

2024-11-22 Thread Agathe Porte
** Also affects: linux-aws-5.15 (Ubuntu)
   Importance: Undecided
   Status: New
** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
   Status: New
** Also affects: linux-aws-5.15 (Ubuntu Jammy)
   Importance: Undecided
   Status: New
** Changed in: linux (Ubuntu) S