[ceph-users] Re: PG scaling questions

胡玮文 Tue, 03 Aug 2021 06:52:06 -0700

On Aug 3, 2021, at 21:32, Gabriel Tzagkarakis <[email protected]> wrote:



Hi, thank you for replying.

Does this method refer to manually setting the number of placement groups while
keeping autoscale_mode off?
Also, from what I can see in the documentation, target_max_misplaced_ratio
implies using the balancer feature, which I am currently not using.

I believe this “auto pgp_num increasing” feature works independently of the
autoscaler and the balancer. The last time I increased pg_num to 1024, I had
autoscale mode set to warn and the balancer off. I recommend reading this
blog post:
https://ceph.io/en/news/blog/2019/new-in-nautilus-pg-merging-and-autotuning/
Specifically, the part near “Starting in Nautilus, this second step is no
longer necessary: …”

And target_max_misplaced_ratio is used not only by the balancer, but also by
this feature.

If I understood correctly, the existing PGs will be split in place and act as
primary for the backfills that will be required to distribute the data evenly
to all OSDs.

Can I use the manual way to increase pgp_num slowly in the pool, and enable
the balancer once my PGs have a more manageable size?

Will there be a considerable amount of downtime while splitting PGs and peering?

I didn’t observe any significant downtime the last time I did this. I think it 
is several seconds at most.

I'm sorry for asking too many questions, I'm trying not to break stuff :)

On Tue, Aug 3, 2021 at 3:46 PM 胡 玮文 
<[email protected]<mailto:[email protected]>> wrote:
Hi,

Each placement group will be split into 4 pieces in place, all at nearly the
same time. No empty PGs will be created.
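
To illustrate the in-place split, here is a small Python sketch of how object
hashes map to PGs before and after going from 8 to 32 PGs. It is only a model:
ceph_stable_mod is reimplemented here as I understand it from the Ceph source,
and pg_for_hash, children and the 4096 sample hash values are hypothetical
names and numbers chosen for this example. The ~400 GB per PG figure is taken
from the question, and data is assumed to be spread evenly.

```python
def ceph_stable_mod(x, b, bmask):
    # Python port of Ceph's stable modulo (my reading of it, not authoritative).
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def pg_for_hash(h, pg_num):
    # bmask is (next power of two >= pg_num) - 1.
    bmask = (1 << (pg_num - 1).bit_length()) - 1
    return ceph_stable_mod(h, pg_num, bmask)


def children(parent, old_pg_num, new_pg_num):
    # Collect every new PG that receives objects from the given old PG.
    return sorted({pg_for_hash(h, new_pg_num)
                   for h in range(4096)  # sample hashes, enough to hit all PGs
                   if pg_for_hash(h, old_pg_num) == parent})


if __name__ == "__main__":
    for parent in range(8):
        kids = children(parent, 8, 32)
        size_gb = 400 / len(kids)  # assumes objects are spread evenly
        ids = ", ".join(f"{k:x}" for k in kids)  # PG ids are shown in hex
        print(f"PG {parent:x} -> {ids} (~{size_gb:.0f} GB each)")
```

In this model every old PG keeps its own id as one of its four children, which
is why the split itself needs no data movement and creates no empty PGs. The
data only moves later, as pgp_num is raised.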

Normally, you only set pg_num and do not touch pgp_num. Instead, you can set
“target_max_misplaced_ratio” (default 5%), and the mgr will increase pgp_num
for you. It increases pgp_num so that some PGs get placed onto other OSDs,
until the misplaced ratio reaches the target. Then it waits for some
backfilling to finish before increasing pgp_num again. (This behavior seems to
have been introduced in Nautilus.)

So I don’t think you need to worry about full OSDs. The “backfillfull ratio”
should throttle backfill when an OSD is nearly full, which in turn will
throttle the pgp_num increase.
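
As a rough illustration of the gradual pgp_num increase, the toy Python model
below steps pgp_num toward pg_num while keeping each step within the misplaced
ratio. This is not the mgr's actual code. The function name pgp_ramp and the
step estimate (raising pgp_num by 1 misplaces about 1/pg_num of the pool) are
assumptions made for this sketch, and it pretends each round starts only after
the previous backfill has fully finished.

```python
def pgp_ramp(pgp_num, pg_num, max_misplaced_ratio=0.05):
    """Toy model: return the successive pgp_num values until pg_num is reached."""
    steps = []
    while pgp_num < pg_num:
        # Assumed estimate: each +1 on pgp_num can misplace about 1/pg_num of
        # the pool, so allow ratio * pg_num PGs per round, but at least 1.
        step = max(int(pg_num * max_misplaced_ratio), 1)
        pgp_num = min(pgp_num + step, pg_num)
        steps.append(pgp_num)
    return steps


if __name__ == "__main__":
    small = pgp_ramp(pgp_num=8, pg_num=32)
    print(f"8 -> 32: {len(small)} rounds, first values {small[:3]}")
    large = pgp_ramp(pgp_num=512, pg_num=1024)
    print(f"512 -> 1024: {len(large)} rounds, first values {large[:3]}")
```

Under these assumptions a small pool moves one PG per round, so only a small
part of the data is in flight at any moment.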

From: Gabriel Tzagkarakis<mailto:[email protected]>
Sent: August 3, 2021 19:42
To: [email protected]<mailto:[email protected]>
Subject: [ceph-users] PG scaling questions

Hello everyone,

I would like to know how autoscaling or manual scaling actually works, so that
I can prevent my cluster from running out of disk space.

Let's say I want to scale a pool of 8 PGs, each ~400 GB, to 32 PGs.

1) Does each placement group get split into 4 pieces IN-PLACE, all at the same
time?
2) Does autoscaling pick one of the existing placement groups at random, for
example X.Y, create new empty placement groups, migrate data onto them, and
then continue to the next big PG, with or without deleting the original PG?
3) Something else?

I am more concerned about the time period when both the
initial/pre-existing PGs and the newly created ones co-exist in the cluster,
because I want to avoid full OSDs. In my case each PG has many small files and
deleting stray PGs takes a long time.

Would it be better if I used something like
ceph osd pool set default.rgw.buckets.data pg_num 32
and then increased pgp_num in increments of 8, assuming one of the original
PGs is affected at a time? But my assumption may be wrong again.

I could not find anything relevant in the documentation.

Thank you
_______________________________________________
ceph-users mailing list -- [email protected]<mailto:[email protected]>
To unsubscribe send an email to 
[email protected]<mailto:[email protected]>

