Hi,

I'm not sure if I can settle a debate, but I can try. Here's a quote from the docs [0]:

One or more objects in the cluster cannot be found. More precisely, the OSDs know that a new or updated copy of an object should exist, but no such copy has been found on OSDs that are currently online.

This is not the usual case of a single corrupted copy; that would more likely show up as an "inconsistent PG" warning, and the PG would still allow reads and writes. Running

ceph tell <pgid> query

could help understand the issue.

If the primary OSD fails, another one takes over, so you're right about that part: it's not a single source of truth. To me it sounds like you have multiple OSDs down, so that min_size cannot be fulfilled.

If you provided more information, we could help a bit more:

ceph -s
ceph osd tree
ceph tell <pgid> query
ceph osd pool ls detail

And a dump of the crush rule used by the affected pool.
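Something like this should get the rule dump and the list of unfound objects (replace the `<pool>`, `<rule_name>` and `<pgid>` placeholders with your actual values; exact output varies by release):

```shell
# Find which crush rule the affected pool uses
ceph osd pool get <pool> crush_rule

# Dump that rule (use the name returned above)
ceph osd crush rule dump <rule_name>

# Listing the unfound objects in the affected PG can also help
ceph pg <pgid> list_unfound
```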

Regards,
Eugen


[0] https://docs.ceph.com/en/latest/rados/operations/health-checks/#object-unfound

Quoting Alex <[email protected]>:

Hi everyone.
Help me settle a debate.

My coworker is seeing
OBJECT_UNFOUND and PG_DAMAGED (recovery_unfound) errors.
We both agree they are caused by bad drives.
The fix is to mark the drive as out, replace it and add it back in.
Whenever we see this error on Ceph we see corresponding read errors on
the physical drive.

I'm saying that even though the drive is bad, there are two more copies;
only 1 of 3 drives has bad sectors preventing the data from being accessed,
which is what dmesg is showing,
i.e. critical medium error, dev sd..., sector 12345...
There should not be OBJECT_UNFOUND, since Ceph compares the remaining
two copies and, assuming the data matches,
it should be able to recover on its own and move the data to another
PG. Or maybe OBJECT_UNFOUND and PG_DAMAGED are warnings, not errors.

My coworker is saying that because the primary OSD responsible for
coordinating the PG was the one that failed,
and it is the "source of truth", the cluster goes into an error state.

His argument doesn't make sense to me since there should be no single
point of failure,
but I'm also not sure about my argument since I don't know enough
about how Ceph works under the hood.

Thanks.
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]

