This is an automated email from the ASF dual-hosted git repository.
weichiu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git
The following commit(s) were added to refs/heads/master by this push:
new d15e8a6347 HDDS-13378. [Docs] Add a Production page under Getting Started (#8734)
d15e8a6347 is described below
commit d15e8a6347faccdc2df96db76dbe404eada360d8
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Mon Jul 14 07:43:24 2025 -0700
HDDS-13378. [Docs] Add a Production page under Getting Started (#8734)
Generated-by: Google Gemini Cli with Gemini 2.5 Pro.
---
.../docs/content/start/ProductionDeployment.md | 89 ++++++++++++++++++++++
.../docs/content/start/ProductionDeployment.zh.md | 69 +++++++++++++++++
2 files changed, 158 insertions(+)
diff --git a/hadoop-hdds/docs/content/start/ProductionDeployment.md b/hadoop-hdds/docs/content/start/ProductionDeployment.md
new file mode 100644
index 0000000000..ed24a7b267
--- /dev/null
+++ b/hadoop-hdds/docs/content/start/ProductionDeployment.md
@@ -0,0 +1,89 @@
+---
+title: Production Deployment
+weight: 6
+menu:
+  main:
+    parent: Getting Started
+---
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+This document provides guidance on the requirements and best practices for a production deployment of Apache Ozone.
+
+## Ozone Components
+
+A typical production Ozone cluster includes the following services:
+
+* **Ozone Manager (OM)**: Manages the namespace and metadata of the Ozone cluster. A production cluster requires 3 OM instances for high availability.
+* **Storage Container Manager (SCM)**: Manages the data nodes and pipelines. A production cluster requires 3 SCM instances for high availability.
+* **DataNode**: Stores the actual data in containers. A production cluster requires at least 3 DataNodes.
+* **Recon**: A web-based UI for monitoring and managing the Ozone cluster. A Recon server is strongly recommended, though not required.
+* **S3 Gateway (S3G)**: An S3-compatible gateway for accessing Ozone. Multiple S3 Gateway instances are strongly recommended to load balance S3 traffic.
+* **HttpFs**: An HDFS-compatible API for accessing Ozone. This is an optional component.
+
+## Requirements
+
+### System Requirements
+
+* **Hardware**: Bare metal machines are recommended for optimal performance. Virtual machines or containers are not recommended for production deployments.
+* **Operating System**: Linux (recommended distributions: Red Hat 8/Rocky 8+, Ubuntu, SUSE; supported architectures: x86/ARM).
+* **Java Development Kit (JDK)**: Version 8 or higher.
+* **Time Synchronization**: A time synchronization service such as Chrony or ntpd must be enabled to prevent time drift.
+
+### Memory Requirements
+
+* **Ozone Manager (OM), Storage Container Manager (SCM), and Recon**: Recommended heap size in large production clusters is 64 GB.
+* **DataNode, S3 Gateway, and HttpFs**: Recommended heap size is 31 GB.
+
+### Storage Requirements
+
+* **Ozone Manager (OM), Storage Container Manager (SCM), and Recon Metadata Storage**: Use SAS SSD or NVMe SSD for metadata (RocksDB and Ratis) to ensure optimal performance. It is recommended to use RAID 1 (disk mirroring) for the metadata disks to protect against disk failures.
+* **DataNode Storage**:
+    * **Ratis Log**: Use SAS SSD or NVMe SSD for the Ratis log directory for low latency writes.
+    * **Container Data**: Hard disks are acceptable for container data storage.
+    * **Disk Configuration**: It is recommended to use a JBOD (Just a Bunch Of Disks) configuration instead of RAID. Ozone is a replicated distributed storage system and handles data redundancy. Using RAID can decrease performance without providing additional data protection benefits.
+* **Storage Type**: Use direct-attached storage. Do not use Network Attached Storage (NAS) or Storage Area Network (SAN).
+
+### Network Requirements
+
+* **Network Bandwidth**: A minimum of 25 Gbps network card bandwidth is recommended.
+* **Network Topology**: A leaf-spine network topology with an oversubscription ratio below 3:1 is recommended for predictable performance.
+
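The oversubscription ratio mentioned above can be estimated per leaf switch as total server-facing bandwidth divided by total spine-facing bandwidth. The following is a minimal sketch of that arithmetic; the function name and the port counts are hypothetical and are not taken from the Ozone documentation.

```python
def oversubscription_ratio(downlink_ports: int, downlink_gbps: float,
                           uplink_ports: int, uplink_gbps: float) -> float:
    """Return server-facing bandwidth divided by spine-facing bandwidth for one leaf."""
    downlink_total = downlink_ports * downlink_gbps
    uplink_total = uplink_ports * uplink_gbps
    if uplink_total <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return downlink_total / uplink_total


# Hypothetical leaf switch: 48 x 25 Gbps server ports and 6 x 100 Gbps uplinks.
ratio = oversubscription_ratio(48, 25, 6, 100)
print(f"{ratio:.1f}:1")  # 1200 Gbps / 600 Gbps = 2.0:1, below the 3:1 guideline
```

With only 4 uplinks instead of 6, the same leaf would sit at exactly 3:1, which no longer satisfies "below 3:1".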
+### Security Requirements (Optional but Recommended)
+
+* **Kerberos**: A Kerberos environment, including a Key Distribution Center (KDC), is recommended for enhanced security.
+
+## Recommended Configurations
+
+### Linux Kernel
+
+* **CPU Governor**: Set the CPU frequency scaling governor to `performance` to maximize performance.
+* **Transparent Huge Pages (THP)**: Disable Transparent Huge Pages to avoid performance issues.
+* **SELinux**: Disable SELinux.
+* **Swappiness**: Set `vm.swappiness=1` to minimize swapping.
+
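One way to audit a host against the kernel settings above is to read the standard Linux procfs and sysfs entries. The sketch below is illustrative only: the helper names are made up, `cpu0` is assumed to be representative of all cores, and SELinux is not checked.

```python
import re
from pathlib import Path
from typing import List, Optional

# Standard Linux locations; cpu0 is assumed to be representative of all cores.
SWAPPINESS = Path("/proc/sys/vm/swappiness")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
GOVERNOR = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from text such as 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"unexpected THP setting: {text!r}")
    return match.group(1)


def evaluate(swappiness: str, thp: str, governor: str) -> List[str]:
    """Return human-readable deviations from the recommended kernel settings."""
    problems = []
    if int(swappiness.strip()) != 1:
        problems.append(f"vm.swappiness is {swappiness.strip()}, expected 1")
    if active_thp_mode(thp) != "never":
        problems.append(f"THP mode is {active_thp_mode(thp)}, expected never")
    if governor.strip() != "performance":
        problems.append(f"CPU governor is {governor.strip()}, expected performance")
    return problems


def read_or_none(path: Path) -> Optional[str]:
    """Read a settings file, returning None when it is missing or unreadable."""
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    values = [read_or_none(p) for p in (SWAPPINESS, THP_ENABLED, GOVERNOR)]
    if any(v is None for v in values):
        print("Some kernel settings are not readable on this host; skipping check.")
    else:
        for problem in evaluate(*values):
            print(problem)
```

The check is read-only, so it can be run on a live node before any tuning is applied.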
+### Local File System
+
+* **LVM**: Disable Logical Volume Manager (LVM) for data drives.
+* **File System**: Use `ext4` or `xfs` file systems.
+* **Mount Options**: Mount drives with the `noatime` option to reduce unnecessary disk writes. For SSDs, also add the `discard` option.
+
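The file system and mount option guidance above can be checked against entries in the `/proc/mounts` or `/etc/fstab` format (device, mount point, file system type, options). This is a rough sketch; the function name, device names, and mount points are placeholders, and LVM usage is not detected.

```python
from typing import List

SUPPORTED_FILE_SYSTEMS = {"ext4", "xfs"}


def check_mount(line: str, is_ssd: bool = False) -> List[str]:
    """Check one mount entry against the file system and mount option guidance."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"not a mount entry: {line!r}")
    _device, mount_point, fs_type, options = fields[:4]
    opts = set(options.split(","))
    problems = []
    if fs_type not in SUPPORTED_FILE_SYSTEMS:
        problems.append(f"{mount_point}: file system is {fs_type}, expected ext4 or xfs")
    if "noatime" not in opts:
        problems.append(f"{mount_point}: missing noatime")
    if is_ssd and "discard" not in opts:
        problems.append(f"{mount_point}: missing discard")
    return problems


# Hypothetical entries; device names and mount points are placeholders.
print(check_mount("/dev/sdb1 /data/disk1 ext4 rw,noatime 0 0"))
print(check_mount("/dev/nvme0n1p1 /data/metadata xfs rw,relatime 0 0", is_ssd=True))
```

The first entry satisfies the guidance; the second is reported twice, once for the missing `noatime` and once for the missing `discard`.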
+### Ozone Configuration
+
+* **Monitoring**: Install Prometheus and Grafana for monitoring the Ozone cluster. For audit logs, consider using a log ingestion framework such as the ELK Stack (Elasticsearch, Logstash, and Kibana) with FileBeat, or other similar frameworks. Alternatively, you can use Apache Ranger to manage audit logs.
+* **Pipeline Limits**: Increase the number of allowed write pipelines to better suit your workload by adjusting `ozone.scm.datanode.pipeline.limit` and `ozone.scm.ec.pipeline.minimum`.
+* **Heap Sizes**: Configure sufficient heap sizes for Ozone Manager (OM), Storage Container Manager (SCM), Recon, DataNode, S3 Gateway (S3G), and HttpFs services to ensure stability.
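
The pipeline limit properties named above are set in the Hadoop-style XML configuration format (`<configuration>` containing `<property>` entries with `<name>` and `<value>`). The sketch below renders such a fragment; the helper name is hypothetical and the numeric values are placeholders chosen for illustration, not recommendations from this page.

```python
import xml.etree.ElementTree as ET
from typing import Mapping


def build_site_xml(overrides: Mapping[str, object]) -> str:
    """Render property overrides as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Placeholder values only; tune them for the actual workload.
print(build_site_xml({
    "ozone.scm.datanode.pipeline.limit": 4,
    "ozone.scm.ec.pipeline.minimum": 6,
}))
```

The generated properties would be merged into the cluster's site configuration file by whatever deployment tooling is in use.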
diff --git a/hadoop-hdds/docs/content/start/ProductionDeployment.zh.md b/hadoop-hdds/docs/content/start/ProductionDeployment.zh.md
new file mode 100644
index 0000000000..4620ccf31b
--- /dev/null
+++ b/hadoop-hdds/docs/content/start/ProductionDeployment.zh.md
@@ -0,0 +1,69 @@
+---
+title: Production Deployment
+weight: 6
+menu:
+  main:
+    parent: Getting Started
+---
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
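+This document provides guidance on the requirements and best practices for a production deployment of Apache Ozone.
+
+## Requirements
+
+### System Requirements
+
+* **Operating System**: Linux (recommended distributions: Red Hat 8/Rocky 8+, Ubuntu, SUSE; supported architectures: x86/ARM).
+* **Java Development Kit (JDK)**: Version 8 or higher.
+* **Time Synchronization**: A time synchronization service such as Chrony or ntpd must be enabled to prevent time drift.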
+
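+### Storage Requirements
+
+* **Metadata Storage**: Use SAS SSD or NVMe SSD for metadata (RocksDB and Ratis) to ensure optimal performance.
+* **DataNode Storage**: Hard disks are acceptable for DataNode data storage.
+* **Storage Type**: Use direct-attached storage. Do not use Network Attached Storage (NAS) or Storage Area Network (SAN).
+
+### Network Requirements
+
+* **Network Bandwidth**: A minimum of 25 Gbps network card bandwidth is recommended.
+* **Network Topology**: A leaf-spine network topology with an oversubscription ratio below 3:1 is recommended for predictable performance.
+
+### Security Requirements (Optional but Recommended)
+
+* **Kerberos**: A Kerberos environment, including a Key Distribution Center (KDC), is recommended for enhanced security.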
+
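+## Recommended Configurations
+
+### Linux Kernel
+
+* **CPU Governor**: Set the CPU frequency scaling governor to `performance` to maximize performance.
+* **Transparent Huge Pages (THP)**: Disable Transparent Huge Pages to avoid performance issues.
+* **SELinux**: Disable SELinux.
+* **Swappiness**: Set `vm.swappiness=1` to minimize swapping.
+
+### Local File System
+
+* **LVM**: Disable Logical Volume Manager (LVM) for data drives.
+* **File System**: Use `ext4` or `xfs` file systems.
+* **Mount Options**: Mount drives with the `noatime` option to reduce unnecessary disk writes. For SSDs, also add the `discard` option.
+
+### Ozone Configuration
+
+* **Monitoring**: Install Prometheus and Grafana to monitor the Ozone cluster.
+* **Pipeline Limits**: Increase the number of allowed write pipelines to better suit your workload by adjusting `ozone.scm.datanode.pipeline.limit` and `ozone.scm.ec.pipeline.minimum`.
+* **Heap Sizes**: Configure sufficient heap sizes for Ozone Manager (OM), Storage Container Manager (SCM), Recon, DataNode, S3 Gateway (S3G), and HttpFs services to ensure stability.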