This is an automated email from the ASF dual-hosted git repository.
sarvekshayr pushed a commit to branch HDDS-9225-website-v2
in repository https://gitbox.apache.org/repos/asf/ozone-site.git
The following commit(s) were added to refs/heads/HDDS-9225-website-v2 by this
push:
new a8100cff5 HDDS-11937. [Website v2] [Docs] [Core Concepts] Storage
Container Manager (#258)
a8100cff5 is described below
commit a8100cff5bba7eb475545fb526e95ee589b0656d
Author: Eric C. Ho <[email protected]>
AuthorDate: Tue Jan 20 13:54:12 2026 +0800
HDDS-11937. [Website v2] [Docs] [Core Concepts] Storage Container Manager
(#258)
---
.../03-storage-container-manager.md | 97 +++++++++++++++++++++
.../{03-s3-gateway.md => 04-s3-gateway.md} | 0
.../01-architecture/StorageContainerManager.png | Bin 0 -> 13336 bytes
3 files changed, 97 insertions(+)
diff --git
a/docs/03-core-concepts/01-architecture/03-storage-container-manager.md
b/docs/03-core-concepts/01-architecture/03-storage-container-manager.md
new file mode 100644
index 000000000..9e6951eb7
--- /dev/null
+++ b/docs/03-core-concepts/01-architecture/03-storage-container-manager.md
@@ -0,0 +1,97 @@
+---
+sidebar_label: Storage Container Manager
+---
+
+# Storage Container Manager
+
+Storage Container Manager (SCM) is the leader node of the *block space
management*.
+The main responsibility is to create and manage
[containers](../02-replication/01-storage-containers.md) which is the main
replication unit of Ozone.
+
+
+
+## Main responsibilities
+
+Storage Container Manager provides multiple critical functions for the Ozone
cluster.
+SCM acts as the cluster manager, Certificate authority, Block manager and the
Replica manager.
+
+SCM is in charge of creating an Ozone cluster. When an SCM is booted up via
`init` command, SCM creates the cluster identity and root certificates needed
for the SCM certificate authority. SCM manages the life cycle of a data node in
the cluster.
+
+1. SCM is the block manager. SCM allocates blocks and assigns them to data
nodes. Clients read and write these blocks directly.
+
+2. SCM keeps track of all the block replicas. If there is a loss of data node
or a disk, SCM detects it and instructs data nodes to make copies of the
missing blocks to ensure high availability.
+
+3. **SCM's Certificate Authority** is in
+charge of issuing identity certificates for each and every
+service in the cluster. This certificate infrastructure makes
+it easy to enable mTLS at network layer and the block
+token infrastructure depends on this certificate infrastructure.
+
+## Main components
+
+For a detailed view of Storage Container Manager this section gives a quick
overview about the provided network services and the stored persisted data.
+
+### Network services provided by Storage Container Manager
+
+- Pipelines: List/Delete/Activate/Deactivate
+ - Pipelines are set of Datanodes to form replication groups
+ - Raft groups are planned by SCM
+- Containers: Create / List / Delete containers
+- Admin related requests
+- Safemode status/modification
+- Replication manager start / stop
+- CA authority service
+- Required by other sever components
+- Datanode HeartBeat protocol
+ - From Datanode to SCM (30 sec by default)
+ - Datanodes report the status of containers, node...
+ - SCM can add commands to the response
+
+:::note
+Client doesn't connect directly to the SCM.
+:::
+
+### Persisted state
+
+The following data is persisted in Storage Container Manager side in a
specific RocksDB directory
+
+- Pipelines
+ - Replication group of servers. Maintained to find a group for new
container/block allocations.
+- Containers
+ - Containers are the replication units. Data is required to act in case of
data under/over replicated.
+- Deleted blocks
+ - Block data is deleted in the background. Need a list to follow the
progress.
+- Valid cert
+- Used by the internal Certificate Authority to authorize other Ozone services
+
+## Safe Mode
+
+SCM (Storage Container Manager) enters safe mode on startup. This is a
protective state that allows the system to become stable before it becomes
fully operational. During safe mode, certain operations like block allocation
are restricted.
+
+### How to Exit Safe Mode
+
+There are two ways to exit safe mode:
+
+1. **Automatic Exit:** SCM will automatically exit safe mode when a set of
predefined `SafeModeExitRule`s are satisfied. These rules ensure that the
cluster is in a healthy state. The primary rules are:
+ - **`DataNodeSafeModeRule`**: Checks if a minimum number of Datanodes have
registered with the SCM. This is configured by `hdds.scm.safemode.min.datanode`
(default: `3`).
+ - **`RatisContainerSafeModeRule`**: Checks if a certain percentage of
containers with at least one replica reported are available. This is configured
by `hdds.scm.safemode.threshold.pct` (default: `0.99`).
+ - **`HealthyPipelineSafeModeRule`**: Checks if a certain percentage of
pipelines are healthy. This is configured by
`hdds.scm.safemode.healthy.pipeline.pct` (default: `0.10`).
+ - **`OneReplicaPipelineSafeModeRule`**: Checks if a certain percentage of
pipelines have at least one replica reported. This is configured by
`hdds.scm.safemode.atleast.one.node.reported.pipeline.pct` (default: `0.90`).
+ - **`ECContainerSafeModeRule`**: Checks if a certain percentage of erasure
coded block groups are healthy. This is also configured by
`hdds.scm.safemode.threshold.pct` (default: `0.99`).
+
+2. **Manual Exit:** You can force SCM to exit safe mode using the `ozone admin
safemode --force-exit` command.
+
+### Safe Mode Pre-Check
+
+There's also a "pre-check" phase. SCM will not exit safe mode until all
pre-check rules are satisfied.
+The `DataNodeSafeModeRule` is a pre-check rule.
+This means that SCM will wait for a minimum number of Datanodes to be
available before it even considers the other conditions for exiting safe mode.
+
+## Notable configurations
+
+| Key | Default | Description |
+| --- | ------- | ----------- |
+| `ozone.scm.container.size` | 5GB | Default container size used by Ozone |
+| `ozone.scm.block.size` | 256MB | The default size of a data block. |
+| `hdds.scm.safemode.min.datanode` | 3 | Minimum number of Datanodes to start
the real work. |
+| `ozone.scm.http-address` | 0.0.0.0:9876 | HTTP address of the SCM server |
+| `ozone.metadata.dirs` | none | Directory to store persisted data (RocksDB). |
diff --git a/docs/03-core-concepts/01-architecture/03-s3-gateway.md
b/docs/03-core-concepts/01-architecture/04-s3-gateway.md
similarity index 100%
rename from docs/03-core-concepts/01-architecture/03-s3-gateway.md
rename to docs/03-core-concepts/01-architecture/04-s3-gateway.md
diff --git a/docs/03-core-concepts/01-architecture/StorageContainerManager.png
b/docs/03-core-concepts/01-architecture/StorageContainerManager.png
new file mode 100644
index 000000000..605c48c35
Binary files /dev/null and
b/docs/03-core-concepts/01-architecture/StorageContainerManager.png differ
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]