The fix for the bug looks at the byte 4 bytes ahead (if the current
position is 0x3C9, then the code would look at the byte at 0x3CD). In
the squid release the byte most likely would be 0 (it could be non-zero
for 4GiB+ extended attributes, which is highly unlikely). In the squid
git snapshot from
The following patch was used to get the hexdumps above:
diff --git a/src/mon/MDSMonitor.cc b/src/mon/MDSMonitor.cc
index 76a57ac443..d36bed2257 100644
--- a/src/mon/MDSMonitor.cc
+++ b/src/mon/MDSMonitor.cc
@@ -143,6 +143,7 @@ void MDSMonitor::update_from_paxos(bool *need_bootstrap)
ceph_asser
** Attachment removed: "src/mds/MDSMap: decode max_xattr_size and bal_rank_mask
in the right order"
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2089565/+attachment/5859087/+files/mds-MDSMap-decode-max_xattr_size-and-bal_rank_mask.patch
--
You received this bug notification because yo
(this description of the fix is added to the patch as well)
bal_rank_mask is stored as a text string with a decimal representation
of a number inside. The string is stored as length of the string (4
bytes, little endian) and then the string itself (without trailing 0,
just the string itself).
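
To make the layout described above concrete, here is a minimal sketch of reading such a length-prefixed string from a raw byte buffer. It is not the Ceph decoder: the helper names (read_le32, decode_prefixed_string, check_prefixed_string_example) and the sample value "-1" are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read a 4-byte little-endian unsigned integer at buf[pos].
std::uint32_t read_le32(const std::vector<std::uint8_t>& buf, std::size_t pos) {
    if (pos > buf.size() || buf.size() - pos < 4) {
        throw std::out_of_range("buffer too short for length prefix");
    }
    return static_cast<std::uint32_t>(buf[pos]) |
           (static_cast<std::uint32_t>(buf[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(buf[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(buf[pos + 3]) << 24);
}

// Hypothetical helper: decode "4-byte length, then the characters" at buf[pos]
// and advance pos past the string. There is no trailing 0 to skip.
std::string decode_prefixed_string(const std::vector<std::uint8_t>& buf,
                                   std::size_t& pos) {
    const std::uint32_t len = read_le32(buf, pos);
    const std::size_t start = pos + 4;
    if (buf.size() - start < len) {
        throw std::out_of_range("buffer too short for string body");
    }
    std::string out(reinterpret_cast<const char*>(buf.data() + start), len);
    pos = start + len;
    return out;
}

// Sample bytes (made up): length 2, then the characters '-' and '1'.
void check_prefixed_string_example() {
    const std::vector<std::uint8_t> buf = {0x02, 0x00, 0x00, 0x00, '-', '1'};
    std::size_t pos = 0;
    assert(decode_prefixed_string(buf, pos) == "-1");
    assert(pos == buf.size());
}
```

Note that the first character of the string sits exactly 4 bytes after the start of the field, right behind the length prefix.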
max
** Changed in: ceph (Ubuntu)
Status: Confirmed => In Progress
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2089565
Title:
MON and MDS crash upgrading CEPH on ubuntu 24.04 LTS
To manage no
** Changed in: ceph (Ubuntu)
Assignee: (unassigned) => Maksym Medvied (medvied)
Now we see that the dir with the Ceph source is ceph-19.2.0. Let's
create a symlink so that gdb can find it:
> sudo ln -sv ceph-19.2.0 ceph-19.2.0-0ubuntu0.24.04.1
'ceph-19.2.0-0ubuntu0.24.04.1' -> 'ceph-19.2.0'
Let's restart gdb with ceph-mon again:
(gdb) start
Temporary breakpoint 1
The addresses here are not continuous, so it makes sense to look at the
full disassembled version as well (i.e. disassemble without /m):
(gdb) disassemble
'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl&)'
Dump of assembler code for function
_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04li
As we see in the diff above
   if (ev >= 17) {
-    decode(max_xattr_size, p);
+    decode(bal_rank_mask, p);
   }
   if (ev >= 18) {
-    decode(bal_rank_mask, p);
+    decode(max_xattr_size, p);
+  }
+
these two decode() calls were swapped. Let's find out why.
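
To illustrate why the order of the two decode() calls matters, here is a small self-contained sketch. It does not use the Ceph encoding code. It assumes that bal_rank_mask is a length-prefixed string (4-byte little-endian length) and that max_xattr_size is an 8-byte little-endian integer. All helper names and the sample values are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append the low `width` bytes of value in little-endian order.
void put_le(Bytes& out, std::uint64_t value, int width) {
    for (int i = 0; i < width; ++i) {
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
    }
}

// Read `width` little-endian bytes at pos and advance pos.
std::uint64_t get_le(const Bytes& in, std::size_t& pos, int width) {
    if (pos > in.size() || in.size() - pos < static_cast<std::size_t>(width)) {
        throw std::out_of_range("ran past end of buffer");
    }
    std::uint64_t value = 0;
    for (int i = 0; i < width; ++i) {
        value |= static_cast<std::uint64_t>(in[pos + i]) << (8 * i);
    }
    pos += width;
    return value;
}

// Length-prefixed string: 4-byte length, then the characters, no trailing 0.
void put_string(Bytes& out, const std::string& s) {
    put_le(out, s.size(), 4);
    out.insert(out.end(), s.begin(), s.end());
}

std::string get_string(const Bytes& in, std::size_t& pos) {
    const std::size_t len = static_cast<std::size_t>(get_le(in, pos, 4));
    if (in.size() - pos < len) {
        throw std::out_of_range("ran past end of buffer");
    }
    std::string s(reinterpret_cast<const char*>(in.data() + pos), len);
    pos += len;
    return s;
}

struct Fields {
    std::string bal_rank_mask;
    std::uint64_t max_xattr_size;
};

// Writer that emits the string first and the integer second.
Bytes encode_mask_first(const Fields& f) {
    Bytes out;
    put_string(out, f.bal_rank_mask);
    put_le(out, f.max_xattr_size, 8);
    return out;
}

// Reader that matches the writer above.
Fields decode_mask_first(const Bytes& in) {
    std::size_t pos = 0;
    Fields f;
    f.bal_rank_mask = get_string(in, pos);
    f.max_xattr_size = get_le(in, pos, 8);
    return f;
}

// Reader with the two fields swapped: it either returns garbage or throws,
// depending on what the misread length prefix happens to contain.
Fields decode_size_first(const Bytes& in) {
    std::size_t pos = 0;
    Fields f;
    f.max_xattr_size = get_le(in, pos, 8);
    f.bal_rank_mask = get_string(in, pos);
    return f;
}

// Sample values are made up; the matching reader round-trips them.
void check_decode_order_example() {
    const Fields original{"-1", 65536};
    const Fields ok = decode_mask_first(encode_mask_first(original));
    assert(ok.bal_rank_mask == "-1" && ok.max_xattr_size == 65536);
}
```

With the sample values, decode_size_first() does not fail at all. It quietly returns an unrelated integer and a one-byte string. With a longer string such as "12345678" the misread length prefix is huge and the sketch throws instead. Either way the reader and the writer no longer agree on the bytes.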
To do so we need to clone the up
git clone https://git.launchpad.net/ubuntu/+source/ceph
cd ceph
> git grep -n MDSMap::decode
src/mds/FSMap.cc:1086: * Insert INLINE; see comment in MDSMap::decode.
src/mds/MDSMap.cc:836:void MDSMap::decode(bufferlist::const_iterator& p)
So we're interested in src/mds/MDSMap.cc (if the file was
Let's find this offset in the disassembled function:
(gdb) disassemble/m
'MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl&)'
Dump of assembler code for function
_ZN6MDSMap6decodeERN4ceph6buffer7v15_2_04list13iterator_implILb1EEE:
Address range 0x77cc2e10 to 0x77cc3c4d:
837
This is the SIGABRT stack backtrace:
1: /lib/x86_64-linux-gnu/libc.so.6(+0x45320) [0x749752045320]
2: pthread_kill()
3: gsignal()
4: abort()
5: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa5ff5) [0x7497524a5ff5]
6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb0da) [0x7497524bb0da]
7: (std::unexpec
The root cause of this bug is that the on-wire representation changed
between the git snapshot 19.2.0~git20240301.4c76c50-0ubuntu6 and the
squid release 19.2.0-0ubuntu0.24.04.1, so the cluster couldn't be
upgraded without downtime. We don't have upgrade tests from the snapshot
to the squid release,