This is an automated email from the ASF dual-hosted git repository.

ritesh pushed a commit to branch HDDS-9225-website-v2
in repository https://gitbox.apache.org/repos/asf/ozone-site.git


The following commit(s) were added to refs/heads/HDDS-9225-website-v2 by this push:
     new e899522b HDDS-9864. Add overview documentation (#147)
e899522b is described below

commit e899522ba5380b009a5299fdc1204998b745f558
Author: Ritesh H Shukla <[email protected]>
AuthorDate: Thu Jul 10 13:59:12 2025 -0700

    HDDS-9864. Add overview documentation (#147)
---
 .markdownlintignore |  1 +
 cspell.yaml         |  6 ++++
 docs/01-overview.md | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 101 insertions(+), 3 deletions(-)

diff --git a/.markdownlintignore b/.markdownlintignore
index 3f9d4c8d..15867b8d 100644
--- a/.markdownlintignore
+++ b/.markdownlintignore
@@ -17,3 +17,4 @@
 
 build
 node_modules
+CLAUDE.md
diff --git a/cspell.yaml b/cspell.yaml
index cdc8643e..f864ca45 100644
--- a/cspell.yaml
+++ b/cspell.yaml
@@ -58,6 +58,9 @@ flagWords:
 - quasi
 # RocksDB docs do not hyphenate this term.
 - column-family
+# Exclude CLAUDE.md from spell checking as it contains development-specific terms
+ignorePaths:
+- CLAUDE.md
 
 # List of words to be always considered correct.
 # Case insensitive.
@@ -121,3 +124,6 @@ words:
 - UX
 - devs
 - CLI
+- lakehouse
+- Flink
+- rebalancing
diff --git a/docs/01-overview.md b/docs/01-overview.md
index 2808232c..6ec52a98 100644
--- a/docs/01-overview.md
+++ b/docs/01-overview.md
@@ -5,8 +5,99 @@ slug: /
 
 # Overview
 
-**TODO:** [HDDS-9864](https://issues.apache.org/jira/browse/HDDS-9864) complete this page
+## What is Apache Ozone?
 
-## What is Ozone?
+Apache Ozone is a scalable, distributed object store designed for lakehouse workloads,
+AI/ML, and cloud-native applications.
+Originating from the BigData analytics ecosystem, it handles both small and large files,
+supporting deployments up to billions of objects and exabytes of capacity.
+Ozone provides strong consistency guarantees,
+multiple protocol interfaces (including S3 compatibility), and configurable durability options.
 
-## Features
+## What it does?
+
+Ozone includes features relevant to large-scale storage requirements:
+
+### Scale
+
+Ozone's architecture separates metadata management from data storage. The Ozone Manager (OM) and
+Storage Container Manager (SCM) handle metadata operations, while Datanodes manage the physical storage of data blocks.
+This design allows for independent scaling of these components and supports incremental cluster growth.
+
+### Flexible Durability
+
+Ozone offers configurable data durability options per bucket or per object:
+
+- **Replication (RATIS):** Uses 3-way replication via the [Ratis (Raft)](https://ratis.apache.org) consensus protocol for high availability.
+- **Erasure Coding (EC):** Supports various EC codecs (e.g., Reed-Solomon) to reduce storage overhead compared to replication while maintaining specified durability levels.
+
+### Secure
+
+Security features are integrated at multiple layers:
+
+- **Authentication:** Supports Kerberos integration for user and service authentication.
+- **Authorization:** Provides Access Control Lists (ACLs) for managing permissions at the volume, bucket, and key levels. Supports Apache Ranger integration for centralized policy management.
+- **Encryption:** Supports TLS/SSL for data in transit and Transparent Data Encryption (TDE) for data at rest.
+- **Tokens:** Uses delegation tokens and block tokens for access control in distributed operations.
+
+### Performance
+
+Ozone's design considers performance for different access patterns:
+
+- **Throughput:** Intended for streaming reads and writes of large files. Data can be served directly from Datanodes after initial metadata lookup.
+- **Latency:** Metadata operations are managed by OM and SCM, designed for low-latency access.
+- **Small File Handling:** Includes mechanisms for managing metadata and storage for large quantities of small files.
+
+### Multiple Protocols
+
+Applications can access data stored in Ozone through several interfaces:
+
+- **S3 Protocol:** Provides an S3-compatible REST API, allowing use with S3-native applications and tools.
+- **Hadoop Compatible File System (ofs):** Offers the `ofs://` scheme for integration with Hadoop ecosystem tools (e.g., Iceberg, Spark, Hive, Flink, MapReduce).
+- **Native Java Client API:** A client library for Java applications.
+- **Command Line Interface (CLI):** Provides tools for administrative tasks and data interaction.
+
+### Efficient Storage Use
+
+Ozone includes features aimed at optimizing storage utilization:
+
+- **Erasure Coding:** Can reduce the physical storage footprint compared to 3x replication.
+- **Small File Handling:** Manages metadata and block allocation for small files.
+- **Containerization:** Groups data blocks into larger Storage Containers, which can simplify management and disk I/O.
+
+### Storage Management
+
+Ozone uses a hierarchical namespace and provides management tools:
+
+- **Namespace:** Organizes data into Volumes (often mapped to tenants) and Buckets (containers for objects), which hold Keys (objects/files).
+- **Quotas:** Administrators can set storage quotas at the Volume and Bucket levels.
+- **Snapshots:** Supports point-in-time, read-only snapshots of buckets for data protection and versioning.
+
+### Strong Consistency
+
+Ozone provides strong consistency for metadata and data operations. Reads reflect the results of the latest successfully completed write operations.
+
+## Key Characteristics
+
+The design of Ozone leads to certain characteristics relevant for large-scale data management:
+
+### Storage Costs
+
+Factors influencing storage costs include:
+
+- **Storage Efficiency:** Erasure Coding can reduce physical storage requirements.
+- **Hardware:** Designed to run on commodity hardware.
+- **Licensing:** Apache Ozone is open-source software under the Apache License 2.0.
+- **Scalability:** Clusters can be expanded by adding nodes or racks. Data rebalancing mechanisms help manage utilization.
+
+### Operations
+
+Aspects related to storage administration include:
+
+- **Unified Storage:** Can potentially serve as a common storage layer for different types of workloads.
+- **Management Tools:** Includes the Recon web UI for monitoring and CLI tools for administration.
+- **Maintenance:** Supports features like rolling upgrades, node decommissioning, and data balancing.
+
+### Hybrid Cloud Scenarios
+
+Ozone's S3 compatibility allows applications developed for S3 to run on-premises using Ozone. This can be relevant for hybrid cloud strategies or migrating workloads between on-premises and cloud environments.
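
Note on the storage-efficiency claim in the overview text above (erasure coding versus 3x replication): the arithmetic can be sketched as below. This is a minimal illustration, not part of the commit. The Reed-Solomon layouts (3+2, 6+3, 10+4) and the function names are assumptions chosen for the example, and real clusters carry additional metadata and padding overhead that this ignores.

```python
# Rough storage-overhead arithmetic: raw bytes stored per logical byte.
# Layouts and names here are illustrative assumptions, not taken from the commit.

def replication_overhead(copies: int) -> float:
    """Each logical byte is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def ec_overhead(data_blocks: int, parity_blocks: int) -> float:
    """A stripe of `data_blocks` data blocks carries `parity_blocks` parity blocks."""
    if data_blocks < 1 or parity_blocks < 0:
        raise ValueError("need data_blocks >= 1 and parity_blocks >= 0")
    return (data_blocks + parity_blocks) / data_blocks


def raw_capacity_needed(logical_bytes: int, overhead: float) -> float:
    """Raw bytes required to store `logical_bytes` at the given overhead."""
    return logical_bytes * overhead


if __name__ == "__main__":
    one_pib = 2 ** 50
    layouts = {
        "3x replication": replication_overhead(3),
        "RS(3,2)": ec_overhead(3, 2),
        "RS(6,3)": ec_overhead(6, 3),
        "RS(10,4)": ec_overhead(10, 4),
    }
    for name, overhead in layouts.items():
        raw_pib = raw_capacity_needed(one_pib, overhead) / one_pib
        print(f"{name}: {overhead:.2f}x overhead, {raw_pib:.2f} PiB raw per PiB stored")
```

Under these assumptions a 6+3 layout stores half as many raw bytes as 3-way replication for the same logical data, while still tolerating the loss of any three blocks in a stripe.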

