This is an automated email from the ASF dual-hosted git repository.

weichiu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git


The following commit(s) were added to refs/heads/master by this push:
     new 65f0c09a46 HDDS-13381. Docs: Add user documentation for Volumes, 
Buckets, and Keys (#8739)
65f0c09a46 is described below

commit 65f0c09a4644505e793ccb2c6cdd7845af6cfc50
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Sat Jul 5 08:18:07 2025 -0700

    HDDS-13381. Docs: Add user documentation for Volumes, Buckets, and Keys 
(#8739)
    
    Generated-by: Google Gemini Cli + Gemini 2.5 Flash.
---
 .../docs/content/concept/VolumesBucketsKeys.md     | 183 +++++++++++++++++++++
 1 file changed, 183 insertions(+)

diff --git a/hadoop-hdds/docs/content/concept/VolumesBucketsKeys.md 
b/hadoop-hdds/docs/content/concept/VolumesBucketsKeys.md
new file mode 100644
index 0000000000..ef3dcadef4
--- /dev/null
+++ b/hadoop-hdds/docs/content/concept/VolumesBucketsKeys.md
@@ -0,0 +1,183 @@
+---
+title: "Volumes, Buckets, and Keys"
+date: "2025-07-03"
+menu:
+  main:
+    parent: "Ozone Manager"
+summary: "Understanding the fundamental data hierarchy in Apache Ozone: 
Volumes, Buckets, and Keys."
+---
+
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+Apache Ozone organizes data in a three-level hierarchy: Volumes, Buckets, and 
Keys. This structure provides a flexible and scalable way to manage large 
datasets, similar to how traditional file systems use directories and files, 
but optimized for object storage.
+
+## Overview of the Hierarchy
+
+*   **Volumes:** The top-level organizational unit, akin to user accounts or 
home directories.
+*   **Buckets:** Reside within volumes, similar to directories or folders, and 
contain the actual data objects.
+*   **Keys:** The fundamental data objects, analogous to files, stored inside 
buckets.
+
+```
+Volume
+└─── Bucket
+    ├─── Key 1
+    ├─── Key 2
+    └─── ...
+```
+
+This hierarchy is managed by the [Ozone Manager]({{< ref "OzoneManager.md" 
>}}), which is the principal namespace service of Ozone.
+
+## Volumes
+
+### What is a Volume?
+
+A **Volume** in Ozone is the highest level of the namespace hierarchy. It 
serves as a logical container for one or more buckets. Conceptually, a volume 
can be thought of as a user's home directory or a project space, providing a 
clear separation of data ownership and management.
+
+**Key Characteristics:**
+*   **Administrative Control:** Only administrators can create or delete 
volumes. This ensures proper resource allocation and access control at the 
highest level.
+*   **Storage Accounting:** Volumes are used as the basis for storage 
accounting, allowing administrators to track resource usage per volume.
+*   **Container for Buckets:** A volume can contain any number of buckets.
+
+### Details
+
+#### Creation and Management
+Volumes are typically created and managed using the Ozone command-line 
interface (CLI). For example:
+```bash
+ozone sh volume create /myvolume
+```
+For more details on volume operations, refer to the [Ozone CLI 
documentation]({{< ref "Cli.md" >}}#volume-operations).
+
+#### Quota Management
+Volumes can have quotas applied to them, limiting the total storage space or 
the number of namespaces (buckets) they can consume. This is crucial for 
multi-tenant environments to prevent any single user or project from 
monopolizing resources.
+*   **Storage Space Quota:** Limits the total data size within the volume.
+*   **Namespace Quota:** Limits the number of buckets that can be created 
within the volume.
+
+For comprehensive information on configuring and managing quotas, see the 
[Quota Management documentation]({{< ref "Quota.md" >}}).
+
+#### Access Control Lists (ACLs)
+Access to volumes is controlled via ACLs, which define permissions for users 
and groups. These permissions determine who can create buckets within a volume, 
list its contents, or perform other operations.
+*   **Create:** Allows creating buckets within the volume.
+*   **List:** Allows listing buckets within the volume.
+*   **Read:** Allows reading metadata of the volume.
+*   **Write:** Allows writing metadata of the volume.
+*   **Delete:** Allows deleting the volume (if empty or recursively).
+
+ACLs can be set and managed using the Ozone CLI. Refer to the [Security ACLs 
documentation]({{< ref "SecurityAcls.md" >}}) for more in-depth information.
+
+#### S3 Gateway Integration (`/s3v` Volume)
+For compatibility with the S3 API, Ozone uses a special volume, typically 
`/s3v`. By default, all buckets accessed via the S3 interface are stored under 
this volume. It's also possible to expose buckets from other Ozone volumes via 
the S3 interface using "bucket linking."
+For more details, refer to the [S3 Protocol documentation]({{< ref "S3.md" 
>}}) and [S3 Multi-Tenancy documentation]({{< ref "feature/S3-Multi-Tenancy.md" 
>}}).
+
+#### DataNode Physical Volumes vs. Ozone Manager Logical Volumes
+It's important to distinguish between the logical "volumes" managed by the 
Ozone Manager (as described above) and the physical "volumes" (disks) managed 
by the DataNodes.
+*   **Ozone Manager Volumes:** Logical namespace containers for buckets and 
keys.
+*   **DataNode Volumes:** Physical storage devices (disks) on a DataNode where 
actual data blocks are stored in containers.
+For more information on DataNode volume management, refer to the [DataNodes 
documentation]({{< ref "Datanodes.md" >}}).
+
+## Buckets
+
+### What is a Bucket?
+
+A **Bucket** is the second level in the Ozone data hierarchy, residing within 
a volume. Buckets are analogous to directories or folders in a traditional file 
system. They serve as containers for keys (data objects).
+
+**Key Characteristics:**
+*   **Contained within Volumes:** Every bucket must belong to a volume.
+*   **Container for Keys:** A bucket can contain any number of keys.
+*   **No Nested Buckets:** Unlike directories, buckets cannot contain other 
buckets.
+
+### Details
+
+#### Creation and Management
+Buckets are created within a specified volume.
+```bash
+ozone sh bucket create /myvolume/mybucket
+```
+For more details on bucket operations, refer to the [Ozone CLI 
documentation]({{< ref "Cli.md" >}}#bucket-operations).
+
+#### Bucket Layouts (Object Store vs. File System Optimized)
+Ozone supports different bucket layouts, primarily:
+*   **Object Store (OBS):** The traditional object storage layout, where keys 
are stored with their full path names. This is suitable for S3-like access 
patterns.
+*   **File System Optimized (FSO):** An optimized layout for Hadoop Compatible 
File System (HCFS) semantics, where intermediate directories are stored 
separately, improving performance for file system operations like listing and 
renaming.
+For more details, refer to the [Prefix FSO documentation]({{< ref 
"feature/PrefixFSO.md" >}}).
+
+#### Encryption (Transparent Data Encryption - TDE)
+Buckets can be configured for Transparent Data Encryption (TDE) at the time of 
creation. When TDE is enabled, all data written to the bucket is automatically 
encrypted at rest using a specified encryption key.
+For detailed steps on setting up and using TDE, refer to the [Securing TDE 
documentation]({{< ref "SecuringTDE.md" >}}).
+
+#### Erasure Coding
+Erasure Coding (EC) can be enabled at the bucket level to define data 
redundancy strategies. This allows for more efficient storage compared to 
replication, especially for large datasets.
+For more information, see the [Erasure Coding documentation]({{< ref 
"feature/ErasureCoding.md" >}}).
+
+#### Snapshots
+Ozone's snapshot feature allows users to take point-in-time consistent images 
of a given bucket. These snapshots are immutable and can be used for backup, 
recovery, archival, and incremental replication purposes.
+For more details, refer to the [Ozone Snapshot documentation]({{< ref 
"feature/Snapshot.md" >}}).
+
+#### GDPR Compliance
+Ozone provides features to support GDPR compliance, particularly the "right to 
be forgotten." When a GDPR-compliant bucket is created, encryption keys for 
deleted data are immediately removed, making the data unreadable even if the 
underlying blocks haven't been physically purged yet.
+For more details, refer to the [GDPR documentation]({{< ref "security/GDPR.md" 
>}}).
+
+#### Bucket Linking
+Bucket linking allows exposing a bucket from one volume (or even another 
bucket) as if it were in a different location, particularly useful for S3 
compatibility or cross-tenant access. This creates a symbolic link-like 
behavior.
+For more information, see the [S3 Protocol documentation]({{< ref "S3.md" >}}) 
and [S3 Multi-Tenancy documentation]({{< ref "feature/S3-Multi-Tenancy.md" >}}).
+
+#### Quota Management
+Similar to volumes, buckets can also have storage space and namespace quotas 
applied to them.
+For comprehensive information on configuring and managing quotas, see the 
[Quota Management documentation]({{< ref "Quota.md" >}}).
+
+#### Access Control Lists (ACLs)
+ACLs define permissions for buckets, controlling who can list keys, read/write 
data, or delete the bucket.
+For more details, refer to the [Security ACLs documentation]({{< ref 
"SecurityAcls.md" >}}).
+
+## Keys
+
+### What is a Key?
+
+A **Key** is the fundamental data object in Ozone, analogous to a file in a 
traditional file system. Keys are stored within buckets and represent the 
actual data that users interact with.
+
+**Key Characteristics:**
+*   **Contained within Buckets:** Every key must reside within a bucket.
+*   **Immutable Data Blocks:** Once written, the underlying data blocks of a 
key are immutable. Updates or modifications to a key typically result in new 
versions or new data blocks being written, with the metadata pointing to the 
latest version.
+
+### Details
+
+#### Creation, Reading, and Management
+Keys are created, read, and managed using the Ozone CLI or various client APIs 
(Java, S3, etc.).
+```bash
+ozone sh key put /myvolume/mybucket/mykey.txt /path/to/local/file.txt
+```
+For more details on key operations, refer to the [Ozone CLI documentation]({{< 
ref "Cli.md" >}}#key-operations).
+
+#### Key Write and Read Process
+When a client writes a key, the Ozone Manager handles the metadata (key name, 
location of data blocks), and the DataNodes store the actual data blocks. For 
reads, the Ozone Manager provides the client with the locations of the data 
blocks, which the client then retrieves directly from the DataNodes.
+For a deeper dive into the key write and read process, refer to the [Ozone 
Manager documentation]({{< ref "OzoneManager.md" >}}).
+
+#### Atomic Key Replacement
+Ozone supports atomic key replacement, ensuring that a key is only overwritten 
if it hasn't changed since it was last read. This prevents lost updates in 
concurrent write scenarios.
+For more details, refer to the [Overwriting Key Only If Unchanged design 
document]({{< ref "design/overwrite-key-only-if-unchanged.md" >}}).
+
+#### Trash
+When keys are deleted from File System Optimized (FSO) buckets, they are moved 
to a trash directory, allowing for recovery. For Object Store (OBS) buckets, 
keys are permanently deleted.
+For more information on the trash feature, refer to the [Trash 
documentation]({{< ref "feature/Trash.md" >}}).
+
+#### Encryption
+If the parent bucket is encrypted, all keys written to that bucket will be 
transparently encrypted.
+For more details, refer to the [Securing TDE documentation]({{< ref 
"SecuringTDE.md" >}}).
+
+#### Access Control Lists (ACLs)
+ACLs can also be applied to individual keys, providing fine-grained control 
over read and write permissions.
+For more details, refer to the [Security ACLs documentation]({{< ref 
"SecurityAcls.md" >}}).


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to