Hi,

On Tue, May 26, 2020 at 06:35:20PM +0200, Val Lorentz wrote:
> Thanks for the tip.
> 
> I just tried downgrading an OSD (armhf) and a monitor (amd64) to
> 14.2.7-1~bpo10+1 using http://snapshot.debian.org/ ; but they are still
> unable to communicate ("failed decoding of frame header:
> buffer::bad_alloc").
> 
> So this might be a different issue, although related.

Well, 14.2.7-~bpo something did work on my armhf OSD cluster,
with 2 mons running on armhf, and one on Proxmox PVE 6 running
ceph 14.2.8.
What already did not work was OSDs on amd64 working together
with a 2x armhf and 1x amd64 mon setup.
I had a lot of problems getting it to work at all, but I thought
it was just my lack of knowledge at that time. 99% of the
problems are with setting up the correct secrets, or in other
words, the handling of the "keyrings". Even between amd64 and
amd64 this has been buggy, judging by the release notes,
specifically 14.2.6 to 14.2.7 I think.
I assume the bugs are in authentication, because as long as I
did not reboot the amd64 machine it worked.
The daemons authenticate using the secrets, and the secret gives
an authentication ticket.

Anyway: the simplest test is to install a system, rsync
/etc/ceph over and run ceph status. It either works (on 32 bits,
fix the timeout in the Python script first, because otherwise it
won't work at all) or it doesn't return at all.
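
Since "doesn't return at all" is awkward to script around, here
is a minimal sketch of that test with a hard deadline. The
probe() helper and the 30 second limit are my own assumptions,
not part of ceph; the only ceph command it runs is the plain
"ceph status" mentioned above.

```python
import subprocess


def probe(cmd, timeout_s=30.0):
    """Run cmd and classify the outcome.

    Returns ("ok", stdout), ("failed", message) or ("timeout", "").
    probe() is a hypothetical helper, not something ceph ships.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The command never returned: the "doesn't return at all" case.
        return ("timeout", "")
    except FileNotFoundError:
        # The binary is not installed on this system.
        return ("failed", "command not found: " + cmd[0])
    if result.returncode == 0:
        return ("ok", result.stdout)
    return ("failed", result.stderr)


if __name__ == "__main__":
    # Assumes /etc/ceph has already been copied to this machine.
    state, detail = probe(["ceph", "status"], timeout_s=30.0)
    print(state)
    print(detail)
```

A "timeout" result only says the client hung. It does not tell
you whether authentication or frame decoding is the cause.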

I will test whether this is also the case with an armhf ceph CLI
client talking to an amd64 cluster. I only have one working amd64
cluster though, and it has 2 fake OSDs, because amd64 clusters
are too expensive to experiment with.
I will have to do some networking hacks to connect the systems.

Anyway: the kernel has no problem talking to either OSD type, so
the kernel's protocol handling is implemented correctly, and
cephx works between an rbd amd64 or armhf kernel client and armhf
userspace.
The rbd amd64 userspace utility however does not work at all. As
far as I can see it can't get past authentication, but without
any logs I am a bit puzzled.

By the way: the mgr dashboard module is about 99% correct. The
disk space is obviously calculated incorrectly.

Regards,
Ard

-- 
.signature not found
