This is an automated email from the ASF dual-hosted git repository.

weichiu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git


The following commit(s) were added to refs/heads/master by this push:
     new 5e72dd4e1e HDDS-13208. [Docs] Add volume management section under Architecture/Datanodes. (#8585)
5e72dd4e1e is described below

commit 5e72dd4e1efba98955300ff831311eb7fae6c0e9
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Thu Jul 3 03:05:33 2025 -0700

    HDDS-13208. [Docs] Add volume management section under Architecture/Datanodes. (#8585)
    
    Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
    Co-authored-by: Chung En Lee <[email protected]>
    
    Generated-by: Copilot Agent (Preview)
---
 hadoop-hdds/docs/content/concept/Datanodes.md | 34 +++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/hadoop-hdds/docs/content/concept/Datanodes.md b/hadoop-hdds/docs/content/concept/Datanodes.md
index cf246712f6..435ab588f8 100644
--- a/hadoop-hdds/docs/content/concept/Datanodes.md
+++ b/hadoop-hdds/docs/content/concept/Datanodes.md
@@ -77,6 +77,40 @@ This extra indirection helps tremendously with scaling Ozone. SCM has far
 less block data to process and the namespace service (Ozone Manager) as a
 different service are critical to scaling Ozone.
 
+## Data Volume Management
+
+### What is a Volume?
+
+In the context of an Ozone DataNode, a "volume" refers to a physical disk or storage device managed by the DataNode. Each volume can store many containers, which are the fundamental units of storage in Ozone. This is different from the "volume" concept in Ozone Manager, which refers to a namespace for organizing buckets and keys.
+
+The status of each volume, including used space, available space, and whether it is healthy or failed, can be viewed in the DataNode Web UI.
+
+### Defining Volumes with hdds.datanode.dir
+
+The property `hdds.datanode.dir` defines the set of volumes (disks) managed by a DataNode. You can specify one or more directories, separated by commas. Each directory represents a volume.
+For example, `/data1/disk1,/data2/disk2` configures the DataNode to manage two volumes.
+
+### Volume Choosing Policy
+
+When a DataNode needs to select a volume to store new data, it uses a volume choosing policy. The policy is controlled by the property `hdds.datanode.volume.choosing.policy`. There are two main policies:
+
+- **CapacityVolumeChoosingPolicy (default):**
+  This policy randomly selects two volumes with enough available space and chooses the one with lower utilization (i.e., more free space). This approach increases the likelihood that less-used disks are chosen, helping to balance disk usage over time.
+
+- **RoundRobinVolumeChoosingPolicy:**
+  This policy selects volumes in a round-robin order, cycling through all available volumes. It does not consider the current utilization of each disk, but ensures even distribution of new containers across all disks.
+
+### Volume-Related Configuration Properties
+
+| Property Name                                 | Default Value                | Description                                                                                  |
+|-----------------------------------------------|------------------------------|----------------------------------------------------------------------------------------------|
+| hdds.datanode.volume.choosing.policy          | CapacityVolumeChoosingPolicy | The policy used to select a volume for new containers.                                       |
+| hdds.datanode.volume.min.free.space           | 20GB                         | Minimum free space required on a volume to be eligible for new containers.                   |
+| hdds.datanode.volume.min.free.space.percent   | 0.001                        | Minimum free space percentage required on a volume to be eligible for new containers.        |
+
+### Disk Balancer
+
+Over time, operations like adding or replacing disks can cause uneven disk usage. The Ozone community is developing a Disk Balancer (see [HDDS-5713](https://issues.apache.org/jira/browse/HDDS-5713)) to automatically balance disk usage across DataNode volumes. This feature is under active development.
 
 ## Notable configurations
 

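The two volume choosing policies described in the diff above can be sketched in a few lines of Java. This is an illustrative sketch only, not the Ozone implementation: `VolumeChoosingSketch`, `Volume`, `eligible`, `chooseByCapacity`, and `RoundRobin` are hypothetical names, and the eligibility rule (free space left after the write must stay at or above the minimum) is an assumption about how a setting like `hdds.datanode.volume.min.free.space` might be applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; all names here are hypothetical, not Ozone classes. */
class VolumeChoosingSketch {

    /** Hypothetical stand-in for one DataNode volume (one configured directory). */
    static final class Volume {
        final String dir;
        final long capacityBytes;
        final long usedBytes;

        Volume(String dir, long capacityBytes, long usedBytes) {
            this.dir = dir;
            this.capacityBytes = capacityBytes;
            this.usedBytes = usedBytes;
        }

        long availableBytes() {
            return capacityBytes - usedBytes;
        }
    }

    /** Assumed rule: a volume is eligible if the write still leaves minFreeBytes free. */
    static boolean hasRoom(Volume v, long requiredBytes, long minFreeBytes) {
        return v.availableBytes() - requiredBytes >= minFreeBytes;
    }

    /** Returns the volumes that pass the assumed eligibility rule. */
    static List<Volume> eligible(List<Volume> volumes, long requiredBytes, long minFreeBytes) {
        List<Volume> out = new ArrayList<>();
        for (Volume v : volumes) {
            if (hasRoom(v, requiredBytes, minFreeBytes)) {
                out.add(v);
            }
        }
        return out;
    }

    /**
     * Capacity-style choice: sample two eligible volumes at random and keep the
     * one with more free space. Sampling is with replacement in this sketch, so
     * the same volume may be drawn twice.
     */
    static Volume chooseByCapacity(List<Volume> volumes, long requiredBytes,
                                   long minFreeBytes, Random rng) {
        List<Volume> candidates = eligible(volumes, requiredBytes, minFreeBytes);
        if (candidates.isEmpty()) {
            throw new IllegalStateException("no volume has enough free space");
        }
        Volume a = candidates.get(rng.nextInt(candidates.size()));
        Volume b = candidates.get(rng.nextInt(candidates.size()));
        return a.availableBytes() >= b.availableBytes() ? a : b;
    }

    /** Round-robin choice: cycle through volumes, skipping ones without room. */
    static final class RoundRobin {
        private int next = 0;

        Volume choose(List<Volume> volumes, long requiredBytes, long minFreeBytes) {
            for (int i = 0; i < volumes.size(); i++) {
                int idx = (next + i) % volumes.size();
                Volume v = volumes.get(idx);
                if (hasRoom(v, requiredBytes, minFreeBytes)) {
                    next = (idx + 1) % volumes.size();
                    return v;
                }
            }
            throw new IllegalStateException("no volume has enough free space");
        }
    }
}
```

The capacity-style method biases placement toward emptier disks without scanning or sorting every volume on each request, while the round-robin variant ignores utilization and simply advances a cursor.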
