This is an automated email from the ASF dual-hosted git repository.

sarvekshayr pushed a commit to branch HDDS-9225-website-v2
in repository https://gitbox.apache.org/repos/asf/ozone-site.git


The following commit(s) were added to refs/heads/HDDS-9225-website-v2 by this push:
     new b3e05f01e HDDS-14260. [Website v2] [Docs] [Administrator Guide] Balancing Data Among Datanodes (#212)
b3e05f01e is described below

commit b3e05f01e920e82e4c13885ca59fd037949ea314
Author: Eric C. Ho <[email protected]>
AuthorDate: Mon Jan 5 15:44:37 2026 +0800

    HDDS-14260. [Website v2] [Docs] [Administrator Guide] Balancing Data Among Datanodes (#212)
---
 .../03-operations/05-data-balancing.md             | 88 +++++++++++++++++++++-
 1 file changed, 87 insertions(+), 1 deletion(-)

diff --git a/docs/05-administrator-guide/03-operations/05-data-balancing.md b/docs/05-administrator-guide/03-operations/05-data-balancing.md
index 18e3aeba0..43231959b 100644
--- a/docs/05-administrator-guide/03-operations/05-data-balancing.md
+++ b/docs/05-administrator-guide/03-operations/05-data-balancing.md
@@ -4,4 +4,90 @@ sidebar_label: Data Balancing
 
 # Balancing Data Among Datanodes
 
-**TODO:** File a subtask under [HDDS-9859](https://issues.apache.org/jira/browse/HDDS-9859) and complete this page or section.
+## Overview
+
+The Container Balancer is a tool in Apache Ozone that balances data containers across the cluster.
+Its primary goal is to ensure an even distribution of data based on disk space usage on Datanodes.
+This helps to prevent some Datanodes from becoming full while others remain underutilized.
+
+The balancer operates by moving `CLOSED` container replicas, which means it doesn't interfere with active I/O operations.
+It is designed to work with both regular and Erasure Coded (EC) containers.
+To maintain cluster stability, the Container Balancer's startup is delayed after a Storage Container Manager (SCM) failover.
+
+## Command Line Usage
+
+The Container Balancer is managed through the `ozone admin containerbalancer` command.
+
+### Start
+
+To start the Container Balancer with default settings:
+
+```bash
+ozone admin containerbalancer start
+```
+
+You can also start the balancer with specific options:
+
+```bash
+ozone admin containerbalancer start [options]
+```
+
+**Options:**
+
+| Option | Description |
+| ------ | ----------- |
+| `-t`, `--threshold` | The percentage deviation from the average utilization of the cluster after which a Datanode will be rebalanced. Default is 10%. |
+| `-i`, `--iterations` | The maximum number of consecutive iterations the balancer will run for. Default is 10. Use -1 for infinite iterations. |
+| `-d`, `--maxDatanodesPercentageToInvolvePerIteration` | The maximum percentage of healthy, in-service Datanodes that can be involved in balancing in one iteration. Default is 20%. |
+| `-s`, `--maxSizeToMovePerIterationInGB` | The maximum size of data in GB to be moved in one iteration. Default is 500GB. |
+| `-e`, `--maxSizeEnteringTargetInGB` | The maximum size in GB that can enter a target Datanode in one iteration. Default is 26GB. |
+| `-l`, `--maxSizeLeavingSourceInGB` | The maximum size in GB that can leave a source Datanode in one iteration. Default is 26GB. |
+| `--balancing-iteration-interval-minutes` | The interval in minutes between each iteration of the Container Balancer. Default is 70 minutes. |
+| `--move-timeout-minutes` | The time in minutes to allow a single container to move from source to target. Default is 65 minutes. |
+| `--move-replication-timeout-minutes` | The time in minutes to allow a single container's replication from source to target as part of a container move. Default is 50 minutes. |
+| `--move-network-topology-enable` | Whether to consider network topology when selecting a target for a source. Default is false. |
+| `--include-datanodes` | A comma-separated list of Datanode hostnames or IP addresses to be included in balancing. |
+| `--exclude-datanodes` | A comma-separated list of Datanode hostnames or IP addresses to be excluded from balancing. |
+
+### Status
+
+To check the status of the Container Balancer:
+
+```bash
+ozone admin containerbalancer status
+```
+
+To get a more detailed status, including the history of iterations:
+
+```bash
+ozone admin containerbalancer status -v --history
+```
+
+### Stop
+
+To stop the Container Balancer:
+
+```bash
+ozone admin containerbalancer stop
+```
+
+## Configuration
+
+The Container Balancer can also be configured through the `ozone-site.xml` file.
+
+| Property | Description | Default Value |
+| -------- | ----------- | ------------- |
+| `hdds.container.balancer.utilization.threshold` | A cluster is considered balanced if for each Datanode, the utilization of the Datanode differs from the utilization of the cluster no more than this threshold. | 10% |
+| `hdds.container.balancer.datanodes.involved.max.percentage.per.iteration` | Maximum percentage of healthy, in-service Datanodes that can be involved in balancing in one iteration. | 20% |
+| `hdds.container.balancer.size.moved.max.per.iteration` | The maximum size of data that will be moved by Container Balancer in one iteration. | 500GB |
+| `hdds.container.balancer.size.entering.target.max` | The maximum size that can enter a target Datanode in each iteration. | 26GB |
+| `hdds.container.balancer.size.leaving.source.max` | The maximum size that can leave a source Datanode in each iteration. | 26GB |
+| `hdds.container.balancer.iterations` | The number of iterations that Container Balancer will run for. | 10 |
+| `hdds.container.balancer.exclude.containers` | A comma-separated list of container IDs to exclude from balancing. | "" |
+| `hdds.container.balancer.move.timeout` | The amount of time to allow a single container to move from source to target. | 65m |
+| `hdds.container.balancer.move.replication.timeout` | The amount of time to allow a single container's replication from source to target as part of a container move. | 50m |
+| `hdds.container.balancer.balancing.iteration.interval` | The interval period between each iteration of Container Balancer. | 70m |
+| `hdds.container.balancer.include.datanodes` | A comma-separated list of Datanode hostnames or IP addresses. Only the Datanodes specified in this list are balanced. | "" |
+| `hdds.container.balancer.exclude.datanodes` | A comma-separated list of Datanode hostnames or IP addresses. The Datanodes specified in this list are excluded from balancing. | "" |
+| `hdds.container.balancer.move.networkTopology.enable` | Whether to take network topology into account when selecting a target for a source. | false |
+| `hdds.container.balancer.trigger.du.before.move.enable` | Whether to send a command to all healthy and in-service data nodes to run `du` immediately before starting a balance iteration. | false |
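
Note on the threshold rule documented in this diff: the description of `hdds.container.balancer.utilization.threshold` can be read as a simple comparison against the cluster average. The following is a minimal sketch of that comparison, not Ozone's implementation. It assumes integer percentages and treats the threshold as percentage points, which is an assumption about how the deviation is measured. The function name `classify_datanode` and its three labels are hypothetical and are not part of the Ozone CLI.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: classify a Datanode against the balancer threshold.
# classify_datanode and its labels are hypothetical, not Ozone commands.
# Arguments are integer percentages: node utilization, cluster average, threshold.
classify_datanode() {
  local util="$1" avg="$2" threshold="$3"
  if (( util > avg + threshold )); then
    echo "over-utilized"      # more than threshold points above the average
  elif (( util < avg - threshold )); then
    echo "under-utilized"     # more than threshold points below the average
  else
    echo "within-threshold"   # deviation is no more than the threshold
  fi
}

# Example: cluster average 60%, documented default threshold 10%.
classify_datanode 75 60 10
classify_datanode 65 60 10
classify_datanode 45 60 10
```

Under these assumptions, a node at exactly average plus threshold (70% here) still counts as within the threshold, matching the "no more than this threshold" wording in the table.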


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]