Hi,

On Tue, May 26, 2020 at 06:35:20PM +0200, Val Lorentz wrote:
> Thanks for the tip.
>
> I just tried downgrading an OSD (armhf) and a monitor (amd64) to
> 14.2.7-1~bpo10+1 using http://snapshot.debian.org/ ; but they are still
> unable to communicate ("failed decoding of frame header:
> buffer::bad_alloc").
>
> So this might be a different issue, although related.

Well, 14.2.7-~bpo something did work on my armhf OSD cluster, with 2
mons running on armhf and one on Proxmox PVE 6 running Ceph 14.2.8.

What already did not work was OSDs on amd64 working together with a
2x armhf and 1x amd64 mon setup.

I had a lot of problems getting it to work at all, but I thought it was
just my lack of knowledge at that time. 99% of the problems are with
setting up the correct secrets, or in other words, the handling of the
"keyrings". Even between amd64 and amd64 this has been buggy, if I look
at the release notes. Specifically 14.2.6 to 14.2.7, I think.

I assume the bugs are in authentication, because as long as I did not
reboot the amd64 machine it worked. The daemons authenticate using the
secrets, and the secret gives an authentication ticket.

Anyway: the simplest test is to install a system, rsync /etc/ceph and
type in "ceph status". It either works (on 32 bits, fix the timeout in
the Python script, because if you don't it won't work at all) or it
doesn't return at all.

I will test whether that is also the case with an armhf ceph CLI client
talking to an amd64 cluster. I only have one working amd64 cluster
though, and it has 2 fake OSDs, because amd64 clusters are too expensive
to experiment with. I have to do some networking hacks to connect the
systems, though.

Anyway: the kernel has no problem talking to either OSD type, so the
kernel's protocol handling is implemented correctly, and cephx works
between an amd64 or armhf rbd kernel client and armhf userspace.

The amd64 rbd userspace utility, however, does not work at all. As far
as I can see it can't get past authentication, but without any logs I
am a bit puzzled.

By the way: the mgr dashboard module is about 99% correct. The disk
space is obviously calculated incorrectly.

Regards,
Ard

-- 
.signature not found
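
A minimal sketch of the "ceph status" smoke test described above, for
telling "works" apart from "never returns". Only the "ceph status"
command comes from the mail; the function name probe_ceph_status, the
30 second limit and the result labels are assumptions made for
illustration. The limit is enforced from the outside by killing the
process, so it does not depend on the CLI's own timeout handling.

```python
import subprocess


def probe_ceph_status(cmd=("ceph", "status"), timeout_s=30.0):
    """Run cmd and classify the outcome.

    Returns one of (labels are illustrative, not Ceph terminology):
      "ok"      - command exited with status 0 within timeout_s
      "error"   - command exited with a non-zero status
      "hang"    - command did not return within timeout_s and was killed
      "missing" - the executable could not be found
    """
    try:
        result = subprocess.run(
            list(cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,  # child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hang"
    except FileNotFoundError:
        return "missing"
    return "ok" if result.returncode == 0 else "error"
```

Calling probe_ceph_status() on a freshly installed client with the
rsynced /etc/ceph would then report "hang" for the case where the
command does not return at all, instead of blocking the shell.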