About three uneventful weeks after upgrading from 15.2.17 to 16.2.14, I’ve
started seeing OSD crashes with "cur >= fnode.size" and "cur >= p.length".
This appears to be fixed in the next Pacific point release, due later this
month, but until then I’d love to keep the OSDs from flapping.
> $ for crash in $(ceph crash ls | grep osd | awk '{print $1}') ; do ceph crash info $crash | egrep "(assert_condition|crash_id)" ; done
> "assert_condition": "cur >= fnode.size",
> "crash_id": "2024-01-03T09:07:55.698213Z_348af2d3-d4a7-4c27-9f71-70e6dc7c1af7",
> "assert_condition": "cur >= p.length",
> "crash_id": "2024-01-03T14:21:55.794692Z_4557c416-ffca-4165-aa91-d63698d41454",
> "assert_condition": "cur >= fnode.size",
> "crash_id": "2024-01-03T22:53:43.010010Z_15dc2b2a-30fb-4355-84b9-2f9560f08ea7",
> "assert_condition": "cur >= p.length",
> "crash_id": "2024-01-04T02:34:34.408976Z_2954a2c2-25d2-478e-92ad-d79c42d3ba43",
> "assert_condition": "cur2 >= p.length",
> "crash_id": "2024-01-04T21:57:07.100877Z_12f89c2c-4209-4f5a-b243-f0445ba629d2",
> "assert_condition": "cur >= p.length",
> "crash_id": "2024-01-05T00:35:08.561753Z_a189d967-ab02-4c61-bf68-1229222fd259",
> "assert_condition": "cur >= fnode.size",
> "crash_id": "2024-01-05T04:11:48.625086Z_a598cbaf-2c4f-4824-9939-1271eeba13ea",
> "assert_condition": "cur >= p.length",
> "crash_id": "2024-01-05T13:49:34.911210Z_953e38b9-8ae4-4cfe-8f22-d4b7cdf65cea",
> "assert_condition": "cur >= p.length",
> "crash_id": "2024-01-05T13:54:25.732770Z_4924b1c0-309c-4471-8c5d-c3aaea49166c",
> "assert_condition": "cur >= p.length",
> "crash_id": "2024-01-05T16:35:16.485416Z_0bca3d2a-2451-4275-a049-a65c58c1aff1",
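In case it helps anyone looking at the same crashes, here is a rough sketch of how the assert conditions in that output could be tallied. The function name `tally_asserts` is just something made up for illustration; it only reads text on stdin and does not talk to the cluster itself.

```shell
# Hypothetical helper: count how often each assert_condition appears in
# `ceph crash info` output supplied on stdin, most frequent first.
tally_asserts() {
  # Split on double quotes: field 2 is the key, field 4 is the condition text.
  awk -F'"' '/"assert_condition"/ { count[$4]++ }
             END { for (c in count) print count[c], c }' | sort -rn
}
```

It could be fed from the same loop as above, for example `for crash in $(ceph crash ls | grep osd | awk '{print $1}') ; do ceph crash info $crash ; done | tally_asserts`.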
As noted in
https://lists.ceph.io/hyperkitty/list/[email protected]/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/
> You can apparently work around the issue by setting
> 'bluestore_volume_selection_policy' config parameter to rocksdb_original.
However, after setting that parameter with
`ceph config set osd.$osd bluestore_volume_selection_policy rocksdb_original`,
it doesn’t appear to take effect:
> $ ceph config show-with-defaults osd.0 | grep bluestore_volume_selection_policy
> bluestore_volume_selection_policy use_some_extra
> $ ceph config set osd.0 bluestore_volume_selection_policy rocksdb_original
> $ ceph config show osd.0 | grep bluestore_volume_selection_policy
> bluestore_volume_selection_policy use_some_extra default mom
I assumed this would reflect the new setting, but it still shows the
default “use_some_extra” value.
Yet this seems to imply that the config is set:
> $ ceph config dump | grep bluestore_volume_selection_policy
> osd.0 dev bluestore_volume_selection_policy rocksdb_original *
> [snip]
> osd.9 dev bluestore_volume_selection_policy rocksdb_original *
Does this need to be set in ceph.conf or is there another setting that also
needs to be set?
Even after bouncing the OSD daemon, `ceph config show` still reports
“use_some_extra”.
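To make the mismatch easier to reproduce, here is a minimal sketch of the comparison I am effectively doing by hand. The function name `policy_mismatch` is hypothetical, and it assumes that `ceph config get <who> <key>` returns the value stored in the mon config database while `ceph config show <who> <key>` returns the value the running daemon reports.

```shell
# Hypothetical helper: compare the stored value with the value the running
# daemon reports for one OSD. Usage: policy_mismatch osd.0
policy_mismatch() {
  who="$1"
  key="bluestore_volume_selection_policy"
  # Assumption: `config get` reads the mon config database.
  stored="$(ceph config get "$who" "$key")"
  # Assumption: `config show` reads what the running daemon reports.
  running="$(ceph config show "$who" "$key")"
  if [ "$stored" = "$running" ]; then
    echo "$who: $key = $running (stored and running agree)"
  else
    echo "$who: stored=$stored running=$running"
    return 1
  fi
}
```

A non-zero return for an OSD would mean the stored value is not what the daemon is running with, which is the situation described above.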
I’d appreciate any help or pointers you can offer to bridge the gap
between now and the next point release.
Thanks,
Reed
_______________________________________________
ceph-users mailing list -- [email protected]