Hi Stefan,
We are running version 18.2.7.
I turned on the debug logs and this looks really weird:
[DBG] public networks ['PREFIX:22::/64']
[DBG] cluster networks []
[DBG] processing data from 12 hosts
[DBG] checking public network membership for: ['host-20', 'host-13',
'host-19', 'host-18', 'host-12', 'host-16', 'host-14', 'host-15',
'host-22', 'host-17', 'host-23', 'host-21']
[DBG] checking network PREFIX:22::/64
[DBG] subnet data - {"subnet": "PREFIX:22::/64", "mtu_map": {"9100":
["host-13", "host-12", "host-16", "host-14", "host-15", "host-17",
"host-21"]}, "speed_map": {"20000": ["host-13", "host-12", "host-16",
"host-14", "host-17", "host-21"], "10000": ["host-15"]}}
[DBG] processing mtu map : {"9100": ["host-13", "host-12", "host-16",
"host-14", "host-15", "host-17", "host-21"]}
[DBG] MTU problems detected
[DBG] most hosts using 9100
[DBG] processing subnet : {"subnet": "PREFIX:22::/64", "mtu_map": {"9100":
["host-13", "host-12", "host-16", "host-14", "host-15", "host-17",
"host-21"]}, "speed_map": {"20000": ["host-13", "host-12", "host-16",
"host-14", "host-17", "host-21"], "10000": ["host-15"]}}
[DBG] linkspeed issue(s) detected
[DBG] most hosts using 10000
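To make the gap in the log above easier to see, here is a small Python sketch that compares the list of checked hosts with the hosts that appear in the mtu_map and speed_map. The data is copied from the debug output; the helper names (hosts_in, missing_hosts, majority_value) are my own and are not taken from cephadm, and the majority count is a plain tally, not necessarily how the config check decides it.

```python
# Host list and maps copied from the debug output above.
checked_hosts = [
    "host-20", "host-13", "host-19", "host-18", "host-12", "host-16",
    "host-14", "host-15", "host-22", "host-17", "host-23", "host-21",
]
mtu_map = {
    "9100": ["host-13", "host-12", "host-16", "host-14", "host-15",
             "host-17", "host-21"],
}
speed_map = {
    "20000": ["host-13", "host-12", "host-16", "host-14", "host-17",
              "host-21"],
    "10000": ["host-15"],
}


def hosts_in(value_map):
    """Return every host that appears under any key of the map."""
    return {host for hosts in value_map.values() for host in hosts}


def missing_hosts(all_hosts, value_map):
    """Hosts that were checked but never show up in the map."""
    return sorted(set(all_hosts) - hosts_in(value_map))


def majority_value(value_map):
    """Key with the most hosts (a plain count, not cephadm's own logic)."""
    return max(value_map, key=lambda key: len(value_map[key]))


print(missing_hosts(checked_hosts, mtu_map))
print(majority_value(speed_map))
```

With the data above, this reports host-18, host-19, host-20, host-22 and host-23 as missing from both maps, and a plain count gives 20000 as the most common speed, unlike the "most hosts using 10000" log line.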
But when I check one of the hosts that does not show up in the subnet or
MTU map (host-22) and run "cephadm --image
quay.io/ceph/ceph@sha256:1b9158ce28975f95def6a0ad459fa19f1336506074267a4b47c1bd914a00fec0
gather-facts", I get the following for the interface that should have the
prefix:
"bond0.22": {
    "driver": "",
    "iftype": "logical",
    "ipv4_address": "",
    "ipv6_address": "fe80::.../64",
    "lower_devs_list": [
        "bond0"
    ],
    "mtu": 9100,
    "nic_type": "ethernet",
    "operstate": "up",
    "speed": 20000,
    "upper_devs_list": []
}
But when I run "cephadm --image
quay.io/ceph/ceph@sha256:1b9158ce28975f95def6a0ad459fa19f1336506074267a4b47c1bd914a00fec0
list-networks", I get:
    ...,
    "PREFIX:22::/64": {
        "bond0.22": [
            "PREFIX:22::186"
        ]
    },
    ...,
    "fe80::/64": {
        ...,
        "bond0.22": [
            "fe80::..."
        ],
        ...
    }
}
So why doesn't it show up in the 9100 mtu_map and in the 20000 speed_map?
And why does it use the link local address in the gather-facts section?
I cross-checked with another host that actually shows up: in its
gather-facts output it has the correct prefix, but its list-networks output
looks the same, including the fe80::/64 network.
Is there a way to prioritize the configured IP addresses in the output and
only use the link-local addresses when there is no other IP address?
I would really rather not disable the link-local address, because I want
the Ceph nodes in our different clusters to be configured as identically as
possible.
We have some clusters that use RA on the mgmt interface and some that have
a static gateway configured.
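To illustrate the kind of prioritization I am asking about, here is a minimal Python sketch using the standard ipaddress module. It is not cephadm's code: the function name preferred_address is hypothetical, and 2001:db8:0:22::186 is a documentation-prefix stand-in for our redacted PREFIX address.

```python
import ipaddress


def preferred_address(addresses):
    """Pick the first non-link-local address; fall back to link-local.

    `addresses` is a list of strings in "address/prefixlen" form.
    Returns an empty string when the list is empty.
    """
    parsed = [ipaddress.ip_interface(addr) for addr in addresses]
    # Keep only addresses outside fe80::/10 (or 169.254.0.0/16 for IPv4).
    non_link_local = [iface for iface in parsed if not iface.ip.is_link_local]
    # Use link-local addresses only when nothing else is configured.
    candidates = non_link_local or parsed
    return str(candidates[0]) if candidates else ""


# Stand-in values: 2001:db8::/32 is reserved for documentation.
print(preferred_address(["fe80::1/64", "2001:db8:0:22::186/64"]))
print(preferred_address(["fe80::1/64"]))
```

With this ordering, an interface like bond0.22 would report its static address, and an interface that only has a link-local address would still report that one.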
On Wed, 17 Sept 2025 at 09:06, Stefan Kooman <[email protected]> wrote:
> On 9/16/25 18:34, Boris wrote:
> > Hi,
> >
> > I am currently debugging an issue with the ceph config checks.
> > We have some random hosts that alert
> >
> > "HOSTNAME does not have an interface on any public network"
> >
> > but they do. It is IPv6, statically configured and, because we don't have a
> > cluster_network, OSDs are bound to that IP in the specific network.
> >
> > I went through the netplan config and they are basically the same on all
> > hosts.
> > And after rebooting the hosts some of them resolved and some didn't.
> >
> > How can I dig deeper to figure out what is going on?
>
>
> Can you find some log output related to these events (documentation here [1])?
>
> >
> > All services are ceph-orch podman containers
> > All hosts are ubuntu 22.04 with latest HWE kernel (6.8.0-79-generic)
>
> What version of Ceph are you running? We have ubuntu 22.04 clusters
> configured exactly like this (IPv6 only), so I'm really curious. We
> haven't seen this behavior yet (18.2.4).
>
> Gr. Stefan
>
> [1]:
>
> https://docs.ceph.com/en/latest/cephadm/operations/#watching-cephadm-log-messages
>
--
The "UTF-8 Problems" self-help group will, as an exception, meet in the
large hall this time.