jerryshao commented on code in PR #10724:
URL: https://github.com/apache/gravitino/pull/10724#discussion_r3079365314


##########
design-docs/gravitino-logical-view-management.md:
##########
@@ -0,0 +1,1051 @@
+# Design of Logical View Management in Gravitino
+
+## Background
+
+In modern data lakehouse architectures, views serve as a fundamental 
abstraction for data access, security enforcement, and query simplification. 
Organizations leverage multiple query engines (Trino, Spark, Hive) to access 
the same underlying data, but view management across these heterogeneous 
systems presents significant challenges:
+
+- **Portability Gap**: A view created in Trino cannot be read by Spark, and 
vice versa, due to differences in SQL dialects and metadata storage formats.
+- **Fragmented Governance**: Views are scattered across different metastores 
(HMS, Iceberg REST Catalog, engine-specific stores), making unified access 
control and auditing difficult.
+- **Inconsistent Security**: Each engine implements its own security model 
(definer/invoker), leading to inconsistent access control behavior across the 
data platform.
+
+Apache Gravitino, as a unified metadata management system, is well-positioned 
to address these challenges by providing centralized view management with 
multi-engine compatibility.
+
+---
+
+## Goals
+
+1. **Multi-Engine Compatibility**: Views managed by Gravitino are visible and 
manageable across engines. Multi-dialect SQL representation storage enables 
cross-engine view sharing.
+
+2. **Unified View Management**: Provide standard CRUD operations for views:
+   - Create view
+   - Get/List views
+   - Alter view (update SQL, add representations, modify properties)
+   - Drop view
+
+3. **Capability-Driven Storage Strategy**: Automatically select the optimal 
storage strategy based on each catalog's capabilities — no user-facing storage 
mode configuration needed. Gravitino transparently handles delegation, 
extension, and full management per catalog type.
+
+4. **Access Control Integration**: Integrate with Gravitino's existing access 
control framework to provide metadata-level privileges (CREATE_VIEW, 
SELECT_VIEW, DROP_VIEW). Data-level access control remains the responsibility 
of the underlying compute engines.
+
+5. **Audit Support**: View operations should be auditable with complete audit 
information.
+
+6. **Event System Integration**: View operations should emit events for users 
to hook into.
+
+---
+
+## Non-Goals
+
+1. **Materialized Views**: This design focuses on logical views only. 
Materialized views with physical storage are out of scope. IRC-based 
materialized views are a planned follow-on that builds on the logical view 
infrastructure established here; they represent a meaningful product 
differentiator that no other open metadata catalog currently offers.
+
+2. **Temporary Views**: Session-scoped temporary views are managed by engines 
themselves and don't require persistent management.
+
+3. **SQL Transpilation**: Gravitino will not automatically convert SQL between 
dialects. Users are responsible for providing correct SQL representations for 
each target dialect.
+
+4. **Query Execution**: Gravitino manages view metadata only. Actual query 
execution is handled by the compute engines.
+
+---
+
+## Proposal
+
+### Namespace
+
+Views are registered under a specified schema in relational catalogs, 
following the three-level namespace hierarchy:
+
+```
+metalake
+  └── catalog (relational)
+        └── schema
+              └── view
+```
+
+This is consistent with Gravitino's existing namespace design for tables and 
functions. **Views and tables share the same namespace within a schema** — a 
view and a table cannot have the same name under the same schema. This follows 
the standard behavior of most relational databases (MySQL, PostgreSQL, Hive, 
etc.).
+
+---
+
+### View Metadata Model
+
+#### Core View Structure
+
+```
+View
+├── name: string                          # View name (unique within schema, shared namespace with tables)
+├── comment: string                       # Optional description
+├── columns: array<ViewColumn>            # View schema definition
+│   └── ViewColumn
+│       ├── name: string
+│       ├── type: DataType
+│       └── comment: string (optional)
+├── representations: array<Representation>    # Multi-dialect view definitions (one per dialect)
+│   └── Representation
+│       ├── type: string                      # Representation type, currently only "sql"
+│       └── SQLRepresentation (type="sql")
+│           ├── dialect: string               # e.g., "trino", "spark", "hive" (unique within a view)
+│           ├── sql: string                   # The view definition SQL
+│           ├── defaultCatalog: string        # Default catalog for unqualified refs
+│           └── defaultSchema: string         # Default schema for unqualified refs
+├── securityConfig: SecurityConfig

Review Comment:
   Using the `xxxConfig` suffix is a little misleading. Do we expect more 
fields here other than the security mode?
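
   To make the question concrete, here is a rough sketch of the two shapes 
being compared. It is only an illustration in Python, not the PR's actual 
code: `SecurityMode`, `SecurityConfig`, `ViewWithConfig`, `ViewWithMode`, and 
the extra `definer` field are all hypothetical names. Only the definer/invoker 
distinction comes from the design doc.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SecurityMode(Enum):
    """The two security models named in the design doc's Background section."""
    DEFINER = "definer"
    INVOKER = "invoker"


# Shape A: a wrapper object, mirroring the `securityConfig` field in the draft.
@dataclass(frozen=True)
class SecurityConfig:
    mode: SecurityMode
    # Hypothetical extra field (assumption): a wrapper type mainly pays off
    # if something like this is planned alongside the mode.
    definer: Optional[str] = None


@dataclass(frozen=True)
class ViewWithConfig:
    name: str
    security_config: SecurityConfig


# Shape B: a plain field, enough if the security mode is the only setting.
@dataclass(frozen=True)
class ViewWithMode:
    name: str
    security_mode: SecurityMode


if __name__ == "__main__":
    a = ViewWithConfig("orders_v", SecurityConfig(SecurityMode.DEFINER, definer="alice"))
    b = ViewWithMode("orders_v", SecurityMode.INVOKER)
    print(a.security_config.mode.value, b.security_mode.value)
```

   If the mode is the only field, shape B keeps the model flatter; shape A 
leaves room for more settings at the cost of an extra level of nesting.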


