This is an automated email from the ASF dual-hosted git repository.
weichiu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone.git
The following commit(s) were added to refs/heads/master by this push:
new 7477431c93b HDDS-13372. [Docs] Compare Ozone to other storage
technologies (#8736)
7477431c93b is described below
commit 7477431c93b3804af070a5b2f6d47f0bf02f7b73
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Wed Jul 9 22:30:37 2025 -0700
HDDS-13372. [Docs] Compare Ozone to other storage technologies (#8736)
Generated-by: ChatGPT 4o, Google Gemini Code Assistant.
---
hadoop-hdds/docs/content/concept/Comparison.md | 77 ++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/hadoop-hdds/docs/content/concept/Comparison.md
b/hadoop-hdds/docs/content/concept/Comparison.md
new file mode 100644
index 00000000000..a07aaa67251
--- /dev/null
+++ b/hadoop-hdds/docs/content/concept/Comparison.md
@@ -0,0 +1,77 @@
+---
+title: Comparison with Other Storage Technologies
+menu:
+ main:
+ parent: Architecture
+weight: 10
+---
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+This document provides a high-level comparison of Apache Ozone with other
storage technologies.
+
+## Open Source Scale-out Storage Comparison
+
+Ozone is most often compared against other open source storage systems.
+
+| Tech | Type | Consistency | Scale
| Big Data Integration | License |
Notes |
+|---------------------|-----------------------|---------------------|--------------------------------------|----------------------|-----------------------|--------------------------------------------------------------------|
+| **Ozone** | Object / File | Strong | Exabyte
scale, tens of billions keys | Native | Apache 2.0 |
Modern Hadoop-native object store with S3 API |
+| **HDFS** | File | Strong | PBs,
billions files | Native | Apache 2.0
| Classic Hadoop FS, no S3 API, tight Hadoop integration |
+| **Ceph** | Object / Block / File | Tunable / Eventual |
Multi-PB, very large | Via S3 Gateway/CephFS | LGPLv2.1 / Ceph
Foundation | General-purpose: underlying RADOS (object), RGW (object), CephFS
(file) |
+| **MinIO** | Object | Strong | Petabyte
scale | Via S3 connectors | AGPLv3 (SSPL-like) |
Cloud-native S3 API, lightweight, fast, no FS semantics |
+| **Lustre** | Parallel File (POSIX) | Strong | PB
scale, HPC | None | GPLv2
| HPC clusters, high-throughput parallel file system |
+| **GlusterFS** | File (POSIX) | Eventually consistent| Large,
multi-PB | None | GPLv3 |
General-purpose scale-out distributed file storage |
+| **OpenStack Swift** | Object | Eventual | Large,
multi-PB | Via connectors | Apache 2.0 |
S3-like multi-tenant object storage for private clouds |
+
+Ozone shines when users are in need of an Apache licensed, strongly consistent
storage system that can scale to billions of keys/files and hundreds of PBs to
EBs.
+
+## Proprietary Scale-Out Storage Comparison
+
+| Tech | Type | Consistency | Scale
| Big Data Integration | Performance Focus | Notes
|
+|-------------------------|------------------|-------------|----------------------|----------------------|--------------------------------------|----------------------------------------------------------------------|
+| **Isilon (Dell PowerScale)** | File (Scale-Out NAS) | Strong | PBs,
billions of files | Indirect | High throughput, good mixed IO
| Enterprise NAS, POSIX compliant, good for mixed workloads, backup, analytics |
+| **VAST** | File / Object | Strong | PBs
| Yes, AI workloads | Ultra-low latency, all-flash NVMe | All-flash,
NFS/S3, great for AI/ML and large unstructured datasets |
+| **WEKA** | Parallel File | Strong | PBs
| HPC, AI | Ultra-low latency, high IOPS |
High-performance file, GPU clusters, NFS/SMB/S3 |
+| **Spectrum Scale (GPFS)** | File (POSIX) | Strong | PBs
| HPC, AI | High throughput, scale-out metadata | IBM, used
in HPC/AI, policy tiering, good POSIX compliance |
+| **Scality** | Object | Strong | PBs
| Some | Good throughput for large objects | Enterprise
S3 API, multi-region, backup archives, hybrid cloud |
+| **Cloudian** | Object | Strong | PBs
| Some | Good throughput for backup/archive |
S3-compatible object storage, ransomware protection, hybrid cloud |
+
+
+The proprietary systems offer enterprise-grade quality, but they often require
proprietary or certified hardware.
+Ozone shines when users look for commodity hardware, open systems and embrace
the vibrant Apache big data open source community.
+
+## Cloud-Native Object Storage Comparison
+
+| Tech | Type | Consistency | Scale | Big Data
Integration | Notes
|
+|----------------|-----------------------|-------------|------------|----------------------|-------------------------------------------------------------------------|
+| **AWS S3** | Object | Strong | Exabyte+ | Native
to cloud ecosystem | The de-facto standard for object storage; massive
durability, S3 API leader |
+| **Azure ABFS** | File/Object (Data Lake Storage) | Strong | Exabyte+ |
Azure-native | HDFS-like semantics for Spark/Hadoop; optimized for
analytics |
+| **Google GCS** | Object | Strong | Exabyte+ | Native
to cloud ecosystem | Globally distributed; strong consistency; well-integrated
with BigQuery |
+| **OCI Object Storage** | Object | Strong | Exabyte+ | Via S3
API & native services | Oracle’s S3-compatible storage; integrates with OCI
Data Flow |
+| **Alibaba OSS** | Object | Strong | Exabyte+ | Via S3
API & native services | S3-compatible, huge China/APAC footprint
|
+| **IBM Cloud Object Storage** | Object | Strong | Exabyte+ | Via S3
API & native services | S3-compatible, geo-dispersed erasure coding for
durability |
+
+These cloud storage offerings are only available from their respective public
cloud vendors. In contrast, Ozone runs on-prem or in your private cloud, giving
you full control.
+
+## Summary
+In summary, Ozone is the best fit in the following scenarios:
+
+1. Large on-prem big data clusters migrating from HDFS.
+2. You want S3 APIs but need strong Hadoop integration.
+3. You want to avoid vendor lock-in and grow cost-effectively on commodity
hardware.
+4. You’re building a private or hybrid cloud with other open-source tools.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]