These options can be confusing.  I’ll wax didactic here for future Cephers 
searching the archives.

> I wanted to increase the number of PG per OSD

Many clusters should ;)

> ....and did so by using
> 
> ceph config set global mon_target_pg_per_osd 800
I kinda wish this had a different name, like pg_autoscaler_target_pg_per_osd.

This value affects the autoscaler’s target, so it only comes into play for 
pools where the autoscaler is not disabled.  It acts IIRC as an upper bound, 
and the per-pool pg_num values are calculated by the autoscaler accordingly.  
Each pool’s pg_num should be a power of 2, and the autoscaler applies a bit of 
additional conservatism to, I suspect, avoid flapping between two values.  Like 
my father’s 1976 Ford Elite would gear-hunt climbing hills.
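
If you want to see what the autoscaler is thinking, this shows each pool’s 
current pg_num and what it would change it to:

ceph osd pool autoscale-status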

That said, 800 is higher than I would suggest for most any OSD.  I would 
suggest a value there of, say, 300-400, which should land you with something 
like 200-300 in practice.  If you aren’t using the autoscaler this doesn’t matter, but I 
would suggest raising it accordingly Just In Case one day you do, or you create 
new pools that do — you’d want the behavior to approximate that of your manual 
efforts.

I usually recommend either entirely disabling the autoscaler or going all-in 
for all pools.
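
If you go one way or the other, it’s something along these lines (substitute 
your own pool names, and off/on as appropriate):

ceph osd pool set <pool> pg_autoscale_mode off
ceph config set global osd_pool_default_pg_autoscale_mode off

The second line only affects pools created afterward.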

> Although the OSD config has the new value , I am still unable to create pools 
> that end up creating more than 250PG per OSD 

That guardrail is mon_max_pg_per_osd.  Read the message more closely; I suspect 
you saw mon_max_pg_per_osd in there and read it as mon_target_pg_per_osd.  If so, 
you’re not the first to make that mistake, especially before I revamped the 
wording of that message a couple years back.

Remember too that this figure, aka the PG ratio, is per-OSD across all pools, 
not per-pool-per-OSD.  So when you have multiple pools that use the same OSDs, 
they all contribute to the sum.  This is the PGS value at far right of 
“ceph osd df” output.
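
As a rough back-of-the-envelope example: a replicated pool with pg_num=1024 and 
size=3 places 3072 PG replicas; spread evenly across 30 OSDs that’s ~102 PGs 
per OSD from that pool alone, before any other pools are counted.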

For both of these options, check that you don’t have an existing value at a 
different scope, which can lead to unexpected results.

ceph config dump | grep pg_per_osd

In many cases it’s safe and convenient to set options at global scope instead 
of mon, osd, etc. (the “who” field), as some options need to be set at scopes 
that are not intuitive.  If you need to set an option to different values for 
different hosts, daemon types, specific OSDs, etc., then the more-granular 
scopes are useful.  Note that RGW options fall under the “client” scope.
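
For example (the values here are made up, just to illustrate the “who” syntax):

ceph config set global mon_max_pg_per_osd 1000        # everywhere; value is an example
ceph config set osd.12 osd_memory_target 6442450944   # one specific OSD; value is an example
ceph config set client rgw_max_put_size 10737418240   # rgw options go under client; value is an example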

So think of mon_max_pg_per_osd as a guardrail or failsafe that prevents you 
from accidentally typing too many zeros, which can have various plusungood 
results.  I personally set it to 1000, you might choose a more conservative 
value of say 600.  Subtly, this can come into play when you lose a host and the 
cluster recovers PGs onto other OSDs, which can result in them crossing the 
threshold and failing to activate.  If your cluster has OSDs of significantly 
varying weight (size), this effect is especially likely, and you’ll scratch your 
head wondering why on earth the PGs won’t activate.
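
If you do decide to raise it, it’s a single config set, and you can confirm the 
effective value the same way as above:

ceph config set global mon_max_pg_per_osd 1000
ceph config dump | grep mon_max_pg_per_osd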

— aad


> 
> I have restarted the OSDs ...and the monitors ...
> 
> Any ideas or suggestions for properly applying the change would be 
> appreciated 
> 
> Steven
> 

_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
