I've resharded the bucket index to 191 shards, which worked out of the box.
The radosgw-admin bucket check --bucket BUCKET --fix --check-objects
command now worked and resulted in:
"calculated_header": {
    "usage": {
        "rgw.none": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 15872659
        }
    }
}
We are in the process of deleting the bucket with radosgw-admin bucket rm
--bucket BUCKET --purge-objects.
Running radosgw-admin bucket check olh --bucket BUCKET --fix prints a lot
of lines like
2025-05-12T09:17:24.787+0000 73d0dcd3e980 -1 ERROR failed to update olh
for: OBJECT update_olh(): (125) Operation canceled
2025-05-12T09:17:24.788+0000 73d0dcd3e980 1 NOTICE: finished shard SHARDID
(0 entries removed)
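As an aside, the (125) in that error is a raw errno; assuming Linux errno
numbering, it decodes to ECANCELED, which can be confirmed with:

```shell
# Decode errno 125 from the update_olh() error above (Linux errno numbering)
python3 -c 'import errno, os; print(errno.errorcode[125], "-", os.strerror(125))'
```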
So I guess this is already pretty bad.
For now I hope that the deletion goes through, but it seems to take a
looooooooooooong time: two hours to bring the calculated num_objects from
16053307 down to 15872659, i.e. ~180k objects, which extrapolates to about
a week until it is finished.
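Back-of-the-envelope check of that estimate, using the two num_objects
values observed two hours apart:

```shell
# ETA for the purge, from the two calculated num_objects values above
START=16053307     # calculated num_objects before
NOW=15872659       # calculated num_objects two hours later
HOURS=2

DELETED=$((START - NOW))      # 180648 objects removed in two hours
RATE=$((DELETED / HOURS))     # ~90k objects/hour, i.e. roughly 25/s
ETA_H=$((NOW / RATE))         # hours remaining at this rate
ETA_D=$((ETA_H / 24))         # ~= 7 days, hence "a week"
echo "deleted=$DELETED per_hour=$RATE eta_hours=$ETA_H eta_days=$ETA_D"
```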
On Tue, May 13, 2025 at 15:15, Enrico Bocchi <
[email protected]> wrote:
> Hi Boris,
>
> We have experienced PGs going laggy in the past with a lower-level rados
> command to list omapkeys from the BI objects in the metadata pool.
>
> 20M objects in 11 shards gives ~1.8M objects per shard, which is indeed
> a lot.
> If you are manually resharding a versioned bucket, consider that the
> number of objects reported by radosgw-admin bucket stats may not be
> accurate. Namely, it does not take into account that versioned objects
> produce 4 (iirc...) entries in the bucket index instead of the 1 entry for
> non-versioned buckets. Also, be careful when deleting objects from a
> versioned or versioning-suspended bucket, as you have to specify the
> version ID if you really want to get rid of the object. Otherwise, the
> object logical head (OLH) will point to a delete marker, but the object
> (and its entry in the index) will stay around, not cleaning up your BIs.
> More on this in the S3 protocol docs.
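To make that point concrete: with the aws CLI, the two delete variants
look roughly like this (bucket/key/version-ID values are made-up
placeholders, and the commands are only echoed here as a dry run):

```shell
# Hypothetical placeholders -- substitute your own bucket/key/version ID
BUCKET=my-versioned-bucket
KEY=some/object
VID=3HL4kqtJlcpXroDTDmJ

# Without --version-id this only inserts a delete marker; the version
# (and its bucket index entries) stays around:
echo aws s3api delete-object --bucket "$BUCKET" --key "$KEY"

# With --version-id the specific version is actually removed:
echo aws s3api delete-object --bucket "$BUCKET" --key "$KEY" --version-id "$VID"
```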
>
> Cheers,
> Enrico
>
>
> On 5/9/25 19:49, Boris wrote:
> >> Resharding the bucket is indeed the solution. While resharding does
> >> have to read all of the keys from the source index objects, it doesn't
> >> read all of them at once. Writing these keys to the target bucket
> >> index objects is the more expensive part, but those are different
> >> objects/PGs and should be better distributed.
> > Ah great. Will try that.
> >
> >> Ensure the index pool is really on only SSDs. I’ve seen crush rules not
> >> specifying device class.
> >>
> > Yes, they are. On dedicated NVMes that we use for the meta pools (the
> > listing I've sent in Slack).
> >
> >> Do you have autoresharding disabled? Versioned objects? Can you do a
> >> bilog trim? Could you preshard a new bucket and move the objects?
> >>
> > No, autoresharding is not enabled. We wanted to test resharding in the
> > multisite setting before we enable it (and reshard all the buckets that
> > need it in a controlled way).
> > Yes, the bucket has versioned objects. We are in the process of deleting
> > it, but the deletion has now been running for two days and the bucket
> > index is still very large.
> > I can try to do a bilog trim; I need to read up on what it does and how
> > to do it.
> > I could move the data to a new presharded bucket, but it feels like the
> > bucket is somehow broken, because deleting is not working as I expect it
> > to.
> >
> > I will try to reshard the bucket tonight and hope it will work out. The
> > explanation from Casey sounds promising.
> > As I have a lot more buckets with a lot more objects (according to the
> > bucket index), this needs to be done anyway.
> >
> > Cheers
> > Boris
> >
> > On Fri, May 9, 2025 at 19:21, Anthony D'Atri <
> > [email protected]> wrote:
> >
> >> Ensure the index pool is really on only SSDs. I’ve seen crush rules not
> >> specifying device class.
> >>
> >> Do you have autoresharding disabled? Versioned objects? Can you do a
> >> bilog trim? Could you preshard a new bucket and move the objects?
> >>
> >>> On May 9, 2025, at 12:54 PM, Boris <[email protected]> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have a bucket that has >20M index entries but only 11 shards.
> >>>
> >>> When I try to run a radosgw-admin bucket check, the PGs that hold the
> >>> index start to become laggy after a couple of seconds. I need to stop it
> >>> because it kills the whole object storage.
> >>> This is a latest-Reef cluster and the master of a multisite setup which
> >>> only replicates the metadata (1 realm, multiple zonegroups, one zone per
> >>> zonegroup).
> >>> Any ideas what I can do?
> >>> I am afraid to reshard the bucket, because I am not sure if I can stop
> >>> the resharding if the PGs become laggy.
> >>> Cheers
> >>> Boris
> >>> _______________________________________________
> >>> ceph-users mailing list -- [email protected]
> >>> To unsubscribe send an email to [email protected]
> >
> --
> Enrico Bocchi
> CERN European Laboratory for Particle Physics
> IT - Storage & Data Management - General Storage Services
> Mailbox: G20500 - Office: 31-2-010
> 1211 Genève 23
> Switzerland
>
--
This time, as an exception, the "UTF-8 problems" self-help group will meet
in the large hall.