Hi

Dmitry Smirnov <only...@debian.org> writes:

> Control: tags -1 unreproducible
>
> On Sun, 16 Nov 2014 16:17:27 Gaudenz Steinlin wrote:
>> >  * Set 'hashpspool' flag on your pools (new default):
>> >     ceph osd pool set {pool} hashpspool true
>> But on the other hand I could not find any information about why this
>> should be run on upgrades. The documentation for this is very sparse.
>> Dimitry do you know what sort of problems this command solves and why it
>> should be run?
>
> Running this command is not mandatory but since it affects distribution of 
> data IMHO it make sense to set "hashpspool" just like you would adjust 
> tunables after upgrade. New pools are created with "hashpspool" by default so 
> I believe it just makes sense to update configuration of old pools.

I agree that it makes sense to upgrade old pools. However, there are
some downsides to this. Issuing the command immediately makes the
cluster unclean and puts it into a degraded state: it changes the
algorithm used to distribute PGs to OSDs, so the cluster starts
backfilling to move all PGs to their new locations. This can create
quite substantial I/O load, which you probably want to plan for on a
heavily loaded production cluster. I guess this is also the reason the
command is not mentioned at all in the ceph.com upgrade guide. Maybe
someone from ceph.com can shed some more light on this.
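For reference, something along these lines should work to check which
pools still lack the flag and to limit the backfill impact while the
PGs move (the injectargs throttle values below are only illustrative,
not a recommendation):

    # list pools and their flags; pools created before the upgrade
    # typically do not show "hashpspool"
    ceph osd dump | grep '^pool'

    # optionally throttle backfill/recovery before setting the flag
    # (example values only, tune for your cluster)
    ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

    # set the flag and watch the cluster return to HEALTH_OK
    ceph osd pool set {pool} hashpspool true
    ceph -w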

I suggest the attached patch to the README.Debian text. If you agree, I
will commit that change.

Gaudenz
diff --git a/debian/README.Debian b/debian/README.Debian
index f4ba80a..79b1273 100644
--- a/debian/README.Debian
+++ b/debian/README.Debian
@@ -43,10 +43,17 @@
 
  * (Restart MDSes).
 
- * Set 'hashpspool' flag on your pools (new default):
+ * Consider setting the 'hashpspool' flag on your pools (new default):
 
     ceph osd pool set {pool} hashpspool true
 
+    This changes the pool to use a new hashing algorithm for the distribution of
+    Placement Groups (PGs) to OSDs. This new algorithm ensures a better distribution
+    to all OSDs. Be aware that this change will temporarily put your cluster into a
+    degraded state and cause additional I/O until all PGs are moved to their new
+    location. See http://tracker.ceph.com/issues/4128 for the details about the new
+    algorithm.
+
  Read more about tunables in
 
     http://ceph.com/docs/master/rados/operations/crush-map/#tunables
