[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-03-08 Thread Launchpad Bug Tracker
This bug was fixed in the package ceph - 19.2.1-0ubuntu1
---
ceph (19.2.1-0ubuntu1) plucky; urgency=medium
  * New upstream stable release (LP: #2097605).
  * d/p/snapshot-upgrade-fix.patch: Fix upgrade crashing (LP: #2089565).
  * d/p/dout-fix.patch: remove, unnecessary for new point

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-03-07 Thread Peter Sabaini
** Description changed: - This issue is a continuation of - https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2065515 + [ Impact ] - On Ubuntu 24.04 lts we did upgrade Ceph to 19.2.0-0ubuntu0.24.04.1 - - Previous release is : 19.2.0~git20240301.4c76c50-0ubuntu6 - - whenever upgrading (

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-03-05 Thread James Page
Only Noble and Oracular will have had the snapshot version that caused this problem, so no need to fix in Plucky.
** Also affects: ceph (Ubuntu Oracular)
   Importance: Undecided
   Status: New
** Also affects: ceph (Ubuntu Plucky)
   Importance: Undecided
   Assignee: Maksym Medvied (medvie

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-02-21 Thread Maksym Medvied
The fix for the bug looks at the byte 4 bytes ahead (if the current position is 0x3C9, then the code would look at the byte at 0x3CD). In the squid release the byte most likely would be 0 (it could be non-zero for 4GiB+ extended attributes, which is highly unlikely). In the squid git snapshot from
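The look-ahead described above can be illustrated with a small Python sketch. It is not the actual patch. It assumes the size field is an 8-byte little-endian integer (the comment only implies this through the 4 GiB threshold) and that the alternative layout starts with a 4-byte length followed by a non-empty ASCII string, as described elsewhere in this thread. The function name and the sample values are hypothetical.

```python
import struct

def byte_ahead_is_zero(buf: bytes, pos: int) -> bool:
    """Peek at the byte 4 positions ahead of pos without consuming anything."""
    return buf[pos + 4] == 0

POS = 0x3C9                       # position used as the example in the comment
prefix = bytes([0xAA]) * POS      # filler standing in for earlier fields

# Assumed layout A: an 8-byte little-endian size below 4 GiB.
# Bytes 4..7 of such a value are zero, so the byte at 0x3CD is 0.
size_layout = prefix + struct.pack("<Q", 65536)

# Assumed layout B: 4-byte length, then the characters of the string.
# The byte at 0x3CD is the first character, which is non-zero.
string_layout = prefix + struct.pack("<I", 3) + b"255"

# The unlikely case from the comment: a size of 4 GiB or more sets byte 4.
huge_size_layout = prefix + struct.pack("<Q", 1 << 32)

print(byte_ahead_is_zero(size_layout, POS))
print(byte_ahead_is_zero(string_layout, POS))
print(byte_ahead_is_zero(huge_size_layout, POS))
```

The third case shows why the check is a heuristic: a size of 4 GiB or more would be indistinguishable from the string layout by this one byte.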

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-02-21 Thread Maksym Medvied
The following patch was used to get the hexdumps above:
diff --git a/src/mon/MDSMonitor.cc b/src/mon/MDSMonitor.cc
index 76a57ac443..d36bed2257 100644
--- a/src/mon/MDSMonitor.cc
+++ b/src/mon/MDSMonitor.cc
@@ -143,6 +143,7 @@ void MDSMonitor::update_from_paxos(bool *need_bootstrap)
 ceph_asser

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-02-21 Thread Maksym Medvied
** Attachment removed: "src/mds/MDSMap: decode max_xattr_size and bal_rank_mask in the right order" https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2089565/+attachment/5859087/+files/mds-MDSMap-decode-max_xattr_size-and-bal_rank_mask.patch -- You received this bug notification because yo

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-02-20 Thread Ubuntu Foundations Team Bug Bot
The attachment "src/mds/MDSMap: decode max_xattr_size and bal_rank_mask in the right order" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team. [This is an automated m

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-02-20 Thread Maksym Medvied
(this description of the fix is added to the patch as well) bal_rank_mask is stored as a text string with a decimal representation of a number inside. The string is stored as the length of the string (4 bytes, little-endian) followed by the string itself (with no trailing 0). max
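The string layout described above can be sketched in a few lines of Python. This only illustrates the byte format (4-byte little-endian length, then the characters, no terminator). It is not Ceph's encoder, and the helper names and the value "255" are hypothetical.

```python
import struct
from typing import Tuple

def encode_string(text: str) -> bytes:
    """4-byte little-endian length, then the raw characters, no trailing 0."""
    data = text.encode("ascii")
    return struct.pack("<I", len(data)) + data

def decode_string(buf: bytes, pos: int = 0) -> Tuple[str, int]:
    """Return the decoded string and the position just past it."""
    if pos + 4 > len(buf):
        raise ValueError("length prefix runs past the end of the buffer")
    (length,) = struct.unpack_from("<I", buf, pos)
    start = pos + 4
    end = start + length
    if end > len(buf):
        raise ValueError("string runs past the end of the buffer")
    return buf[start:end].decode("ascii"), end

encoded = encode_string("255")
print(encoded.hex())  # 03000000323535
print(decode_string(encoded))
```

The first four bytes carry the length 3, and the remaining three bytes are the ASCII digits themselves.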

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-02-19 Thread Maksym Medvied
** Changed in: ceph (Ubuntu) Status: Confirmed => In Progress

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-01-27 Thread Maksym Medvied
** Changed in: ceph (Ubuntu) Assignee: (unassigned) => Maksym Medvied (medvied)

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-01-27 Thread Guillaume COEUGNET
Hi! As far as I know, upgrading your whole cluster to 19.2.0-0ubuntu0.24.04.2 from 19.2.0~git20240301.4c76c50-0ubuntu6 will not solve the problem. I've tried that already, and ceph-mon and ceph-mds crashed at start with SIGABRT on all nodes. Even ceph-osd, which did start successfully after the upgrade, i

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-01-26 Thread Jan Evert van Grootheest
Hi Stephane, thanks for the swift response! I can bring the whole cluster down and restart it at the new version, i.e. I can manage a few minutes without the cluster (for example during a weekend). Would that work to update everything? If I understand correctly, the issue is in the network mes

Re: [Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-01-25 Thread stephane maubian
Greetings, unfortunately, for now the solution is to put the release on hold until a new apt package is released, unless you want to build from source. > > On 25 Jan 2025 at 14:25, Jan Evert van Grootheest > <2089...@bugs.launchpad.net> wrote: > > > May I ask, how t

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2025-01-25 Thread Jan Evert van Grootheest
May I ask, how to recover from this situation? Currently I have a cluster using 19.2.0~git20240301.4c76c50-0ubuntu6. The latest version as of today is 19.2.0-0ubuntu0.24.04.2. I've read through the Ceph bug report [1] and a related commit [2]. The discussion following [3] also appears related

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-21 Thread Maksym Medvied
Now we see that the directory with the Ceph source is ceph-19.2.0. Let's create a symlink so that gdb can find it:
> sudo ln -sv ceph-19.2.0 ceph-19.2.0-0ubuntu0.24.04.1
'ceph-19.2.0-0ubuntu0.24.04.1' -> 'ceph-19.2.0'
Let's restart gdb with ceph-mon again:
(gdb) start
Temporary breakpoint 1

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-21 Thread Maksym Medvied
The addresses here are not contiguous, so it makes sense to look at the full disassembled version as well (i.e. disassemble without /m):
(gdb) disassemble 'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl&)'
Dump of assembler code for function _ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04li

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-21 Thread Maksym Medvied
As we see in the diff above

if (ev >= 17) {
-decode(max_xattr_size, p);
+decode(bal_rank_mask, p);
}
if (ev >= 18) {
-decode(bal_rank_mask, p);
+decode(max_xattr_size, p);
+ }
+

these two decode() calls were swapped. Let's find out why. To do so we need to clone the up
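To see why swapping two decode() calls matters, here is a minimal Python sketch of a writer and a reader that disagree on field order. It is an illustration only, not Ceph code. It assumes one field is an 8-byte little-endian integer and the other a length-prefixed string; which field the writer emits first is chosen arbitrarily here, and all names and values are hypothetical.

```python
import struct
from typing import Tuple

def encode_u64(value: int) -> bytes:
    return struct.pack("<Q", value)

def encode_string(text: str) -> bytes:
    data = text.encode("ascii")
    return struct.pack("<I", len(data)) + data

def decode_u64(buf: bytes, pos: int) -> Tuple[int, int]:
    if pos + 8 > len(buf):
        raise ValueError("integer runs past the end of the buffer")
    return struct.unpack_from("<Q", buf, pos)[0], pos + 8

def decode_string(buf: bytes, pos: int) -> Tuple[str, int]:
    if pos + 4 > len(buf):
        raise ValueError("length prefix runs past the end of the buffer")
    (length,) = struct.unpack_from("<I", buf, pos)
    end = pos + 4 + length
    if end > len(buf):
        raise ValueError("string runs past the end of the buffer")
    return buf[pos + 4:end].decode("ascii"), end

# Hypothetical writer: integer first, then string.
payload = encode_u64(65536) + encode_string("255")

def decode_matching(buf: bytes) -> Tuple[int, str]:
    size, pos = decode_u64(buf, 0)
    mask, pos = decode_string(buf, pos)
    return size, mask

def decode_swapped(buf: bytes) -> Tuple[int, str]:
    # Reads the low 4 bytes of the integer as a string length (65536 here),
    # which points far beyond the 15-byte payload.
    mask, pos = decode_string(buf, 0)
    size, pos = decode_u64(buf, pos)
    return size, mask

print(decode_matching(payload))
try:
    decode_swapped(payload)
except ValueError as err:
    print("swapped reader failed:", err)
```

In this sketch the mismatched reader fails with an out-of-bounds read, which is the general kind of failure a decoder hits when the two sides disagree on field order.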

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-21 Thread Maksym Medvied
git clone https://git.launchpad.net/ubuntu/+source/ceph
cd ceph
> git grep -n MDSMap::decode
src/mds/FSMap.cc:1086: * Insert INLINE; see comment in MDSMap::decode.
src/mds/MDSMap.cc:836:void MDSMap::decode(bufferlist::const_iterator& p)
So we're interested in src/mds/MDSMap.cc (if the file was

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-21 Thread Maksym Medvied
Let's find this offset in the disassembled function:
(gdb) disassemble/m 'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl&)'
Dump of assembler code for function _ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE:
Address range 0x77cc2e10 to 0x77cc3c4d:
837

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-21 Thread Maksym Medvied
This is the SIGABRT stack backtrace:
 1: /lib/x86_64-linux-gnu/libc.so.6(+0x45320) [0x749752045320]
 2: pthread_kill()
 3: gsignal()
 4: abort()
 5: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa5ff5) [0x7497524a5ff5]
 6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb0da) [0x7497524bb0da]
 7: (std::unexpec

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-21 Thread Maksym Medvied
The root cause of this bug is that the on-wire representation changed between the git snapshot 19.2.0~git20240301.4c76c50-0ubuntu6 and the squid release 19.2.0-0ubuntu0.24.04.1, so the cluster couldn't be upgraded without downtime. We don't have upgrade tests from the snapshot to the squid release,

[Bug 2089565] Re: MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS

2024-12-02 Thread Jan Evert van Grootheest
** Summary changed: - Issue upgrading CEPH on ubuntu 24.04 LTS + MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS