This is an automated email from the ASF dual-hosted git repository.

weichiu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git


The following commit(s) were added to refs/heads/master by this push:
     new 2ae8d6da28 HDDS-13379. Document SCM Safe Mode and its configuration properties. (#8737)
2ae8d6da28 is described below

commit 2ae8d6da2849286b044e9877a90fdd919c3653c4
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Wed Jul 9 18:16:20 2025 -0700

    HDDS-13379. Document SCM Safe Mode and its configuration properties. (#8737)
    
    Generated-by: Google Gemini 2.5 Pro/Flash, + Gemini Cli
---
 .../docs/content/concept/StorageContainerManager.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/hadoop-hdds/docs/content/concept/StorageContainerManager.md b/hadoop-hdds/docs/content/concept/StorageContainerManager.md
index 48d509016a..00aa7d0d38 100644
--- a/hadoop-hdds/docs/content/concept/StorageContainerManager.md
+++ b/hadoop-hdds/docs/content/concept/StorageContainerManager.md
@@ -87,6 +87,27 @@ The following data is persisted in Storage Container Manager side in a specific
  * Valid cert
   * Used by the internal Certificate Authority to authorize other Ozone services
 
+## Safe Mode
+
+SCM (Storage Container Manager) enters safe mode on startup. This is a protective state that allows the system to become stable before it becomes fully operational. During safe mode, certain operations like block allocation are restricted.
+
+### How to Exit Safe Mode
+
+There are two ways to exit safe mode:
+
+1.  **Automatic Exit:** SCM will automatically exit safe mode when a set of predefined `SafeModeExitRule`s are satisfied. These rules ensure that the cluster is in a healthy state. The primary rules are:
+    *   **`DataNodeSafeModeRule`**: Checks if a minimum number of DataNodes have registered with the SCM. This is configured by `hdds.scm.safemode.min.datanode` (default: `3`).
+    *   **`RatisContainerSafeModeRule`**: Checks if a certain percentage of Ratis containers have at least one replica reported. This is configured by `hdds.scm.safemode.threshold.pct` (default: `0.99`).
+    *   **`HealthyPipelineSafeModeRule`**: Checks if a certain percentage of pipelines are healthy. This is configured by `hdds.scm.safemode.healthy.pipeline.pct` (default: `0.10`).
+    *   **`OneReplicaPipelineSafeModeRule`**: Checks if a certain percentage of pipelines have at least one DataNode reported. This is configured by `hdds.scm.safemode.atleast.one.node.reported.pipeline.pct` (default: `0.90`).
+    *   **`ECContainerSafeModeRule`**: Checks if a certain percentage of erasure-coded block groups are healthy. This is also configured by `hdds.scm.safemode.threshold.pct` (default: `0.99`).
+
+2.  **Manual Exit:** You can force SCM to exit safe mode using the `ozone admin safemode exit` command.
+
+### Safe Mode Pre-Check
+
+There's also a "pre-check" phase. SCM will not exit safe mode until all pre-check rules are satisfied. The `DataNodeSafeModeRule` is a pre-check rule. This means that SCM will wait for a minimum number of DataNodes to be available before it even considers the other conditions for exiting safe mode.
+
 ## Notable configurations
 
 key | default | description 
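
The exit logic described in the added Safe Mode section can be sketched roughly as follows. This is a minimal illustrative model in Python, not Ozone's implementation: the `ClusterSnapshot` class, its field names, the helper functions, and the `>=` comparisons are all assumptions made for the example. Only the property names and default values are taken from the documentation in this commit.

```python
from dataclasses import dataclass

# Property names and defaults as listed in the documentation added by this commit.
DEFAULTS = {
    "hdds.scm.safemode.min.datanode": 3,
    "hdds.scm.safemode.threshold.pct": 0.99,
    "hdds.scm.safemode.healthy.pipeline.pct": 0.10,
    "hdds.scm.safemode.atleast.one.node.reported.pipeline.pct": 0.90,
}


@dataclass
class ClusterSnapshot:
    """Hypothetical summary of what SCM has learned since startup.

    All ratio fields are fractions in the range 0.0 to 1.0.
    """
    registered_datanodes: int
    ratis_containers_reported: float
    ec_containers_reported: float
    healthy_pipelines: float
    pipelines_with_one_report: float


def precheck_passed(snapshot, conf=DEFAULTS):
    # Pre-check phase: enough DataNodes must have registered
    # (modelled on DataNodeSafeModeRule).
    return snapshot.registered_datanodes >= conf["hdds.scm.safemode.min.datanode"]


def can_exit_safe_mode(snapshot, conf=DEFAULTS):
    # The remaining rules are only considered once the pre-check passes.
    if not precheck_passed(snapshot, conf):
        return False
    container_threshold = conf["hdds.scm.safemode.threshold.pct"]
    return (
        # Ratis and EC container rules share the same threshold property.
        snapshot.ratis_containers_reported >= container_threshold
        and snapshot.ec_containers_reported >= container_threshold
        and snapshot.healthy_pipelines
        >= conf["hdds.scm.safemode.healthy.pipeline.pct"]
        and snapshot.pipelines_with_one_report
        >= conf["hdds.scm.safemode.atleast.one.node.reported.pipeline.pct"]
    )
```

In this model, a snapshot with only two registered DataNodes never leaves safe mode regardless of container or pipeline state, which mirrors the pre-check behaviour the section describes.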


