(gravitino) branch main updated: [Design Docs] Iceberg REST supports nested namespace (#10720)

jshao Thu, 30 Apr 2026 00:29:30 -0700

This is an automated email from the ASF dual-hosted git repository.

jerryshao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git



The following commit(s) were added to refs/heads/main by this push:
     new 5c1e24bfe1 [Design Docs] Iceberg REST supports nested namespace (#10720)
5c1e24bfe1 is described below

commit 5c1e24bfe1070041840ba8dfa0964758fcb48dc7
Author: roryqi <[email protected]>
AuthorDate: Thu Apr 30 15:29:17 2026 +0800

    [Design Docs] Iceberg REST supports nested namespace (#10720)
    
    ### What changes were proposed in this pull request?
    
    Design document for nested namespace support in Iceberg REST.
    
    ### Why are the changes needed?
    
    
    ### Does this PR introduce _any_ user-facing change?
    
    No need.
    
    ### How was this patch tested?
    
    Just design document
---
 design-docs/iceberg-supported-nested-namespace.md | 575 ++++++++++++++++++++++
 1 file changed, 575 insertions(+)

diff --git a/design-docs/iceberg-supported-nested-namespace.md b/design-docs/iceberg-supported-nested-namespace.md
new file mode 100644
index 0000000000..34cdc58c80
--- /dev/null
+++ b/design-docs/iceberg-supported-nested-namespace.md
@@ -0,0 +1,575 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+# [Iceberg REST] Supported Nested Namespace Design
+
+## Background
+
+This document describes one practical solution to support Iceberg nested 
namespaces in Gravitino.
+The scope is not only UI privilege granting, but also namespace mapping, 
identifier handling,
+authorization scope, and compatibility behavior across Iceberg REST and 
Gravitino.
+
+References:
+
+- https://github.com/apache/gravitino/blob/main/docs/security/access-control.md
+- https://github.com/apache/gravitino/blob/main/docs/iceberg-rest-service.md
+- https://github.com/apache/gravitino/blob/main/docs/manage-relational-metadata-using-gravitino.md
+- https://github.com/apache/gravitino/discussions/7296
+
+## Goal
+
+- Support nested namespace operations from Iceberg REST to Gravitino through 
schema mapping.
+- Support privilege granting for different nested namespace scopes (including 
UI workflow).
+- Keep metadata model stable and avoid heavy refactor.
+
+
+## Solution Options
+
+### Option A: Add a new metadata object `NestedNamespace`
+
+Use a new metadata object `NestedNamespace` to represent nested namespace 
explicitly.
+`NestedNamespace` has a one-to-one mapping with Iceberg `Namespace` to avoid 
ambiguity
+with existing Gravitino `Namespace` concepts.
+
+Catalog -> NestedNamespace a -> NestedNamespace a.b -> Table a.b.c
+                              -> NestedNamespace a.c -> NestedNamespace a.c.d -> Table a.c.d.e
+
+Pros:
+
+- Clearer concept modeling.
+
+Cons:
+
+- Large refactor across metadata model, API, authorization, and UI.
+
+### Option B (Recommended): Reuse `Schema` entity and enhance schema expression capability
+
+Keep physical metadata unchanged (still persisted as `Schema`) and introduce
+`HierarchicalSchema` as a logical expression layer in Iceberg REST adaptation,
+identifier rendering, and authorization scope matching.
+
+Pros:
+
+- Low-impact evolution path without introducing a new metadata entity.
+- Decouples nested namespace semantics from `.` and reduces parser ambiguity.
+- Reuses existing metadata and authorization model to reduce implementation 
risk.
+
+Cons:
+
+- Requires explicit conversion rules between logical path and physical schema 
name.
+- Authorization matching and identifier serialization become more complex.
+
+#### Option B Separator: Configurable logical separator (default `:`)
+
+Examples:
+
+- `A:B:C` as logical `HierarchicalSchema` path when separator is `:`.
+- Physical schema name remains mapped through conversion layer.
+
+Pros:
+
+- Better readability than escaping `.` in many clients and UI forms.
+- Lower routing conflict risk than `/`.
+- Easier to keep backward compatibility with existing non-nested schema 
handling.
+
+Cons:
+
+- Needs clear validation rule to avoid ambiguity with existing schema names 
containing configured
+  separator.
+
+## Design
+
+### Identifier Rules
+
+- Introduce logical identifier concept: `HierarchicalSchema`.
+- `HierarchicalSchema` uses a configurable logical separator (default `:`) in 
API/logic layer.
+- For Gravitino REST create/update schema APIs, `request.getName()` keeps the 
logical schema name
+  and may contain `:` (for example `A:B` or `A:B:C`).
+- Before persisting to `EntityStore`, schema path is normalized to ASCII-1 
(`\u0001`)-separated physical schema
+  name.
+- Escaping strategy: each path segment is encoded before physical flattening 
to avoid ambiguity.
+- Configured logical separator is reserved as hierarchy separator and is not 
allowed inside one
+  namespace segment.
+- `.` inside one segment is allowed and must not be treated as hierarchy 
separator.
+- Parsing is direct split/join by configured logical separator at API boundary.
+- Keep flat storage model and convert `HierarchicalSchema` path to physical 
schema name by mapping rules.
+- Identifier rendering rule:
+  - Use encoded `HierarchicalSchema` path directly in schema position.
+  - Do not rely on single-quote wrapping for schema disambiguation in this 
phase.
+
+Examples:
+
+- Nested namespace `A:B` maps to logical `HierarchicalSchema` path `A:B` 
(assuming configured
+  separator is `:`).
+- Nested namespace `A:B:C` maps to logical `HierarchicalSchema` path `A:B:C` 
(assuming configured
+  separator is `:`).
+- Logical `HierarchicalSchema` path is then converted to physical schema name 
through mapping rules.
+- Namespace levels `["team", "sales"]` are serialized using configured 
separator, e.g.
+  `team:sales`.
+- Parsing `team:sales` returns `["team", "sales"]` when separator is `:`.
+- Identifier rendering example:
+  - `metalake.catalog.A:B.table1`
+  - `metalake.catalog.team:sales.table2`
+- In UI display and API transport, use logical path directly (for example 
`A:B:C`).
+
+### Physical Name Mapping and Reversibility
+
+- **Persisted schema name in `EntityStore` always uses ASCII-1 (`\u0001`) as 
the internal storage separator** for
+  stable storage semantics.
+- External request/response handling uses configured logical separator and 
converts at API boundary.
+- Connector-facing behavior remains Iceberg-compatible and does not require 
users to configure or
+  input internal storage representation.
+- Mapping must be reversible:
+  - `logical path segments` -> `encode each segment` -> `join by '\u0001'` for 
physical storage.
+  - physical schema name -> `split by '\u0001'` -> `decode each segment` -> 
logical path segments.
+  - This avoids ambiguity when one segment contains `.` (for example 
`my.schema`).
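
The reversible mapping above can be illustrated with a minimal sketch. The design does not fix a concrete segment encoding, so the percent-style escaping of `%`, `.`, and ASCII-1 below is an assumption (chosen to match the `my%2Eschema` example in Snippet 1), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PhysicalSchemaNameSketch {
  // Fixed internal storage separator described in the design (ASCII-1).
  static final char STORAGE_SEPARATOR = '\u0001';

  // Assumed encoding: escape '%', '.' and ASCII-1 as %XX so the join stays unambiguous.
  static String encodeSegment(String segment) {
    StringBuilder sb = new StringBuilder();
    for (char c : segment.toCharArray()) {
      if (c == '%' || c == '.' || c == STORAGE_SEPARATOR) {
        sb.append(String.format("%%%02X", (int) c));
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  // Inverse of encodeSegment; throws on malformed input such as a trailing '%'.
  static String decodeSegment(String encoded) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < encoded.length(); i++) {
      char c = encoded.charAt(i);
      if (c == '%') {
        sb.append((char) Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
        i += 2;
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  // logical path segments -> encode each segment -> join by ASCII-1
  static String toPhysical(List<String> segments) {
    List<String> encoded = new ArrayList<>();
    for (String segment : segments) {
      encoded.add(encodeSegment(segment));
    }
    return String.join(String.valueOf(STORAGE_SEPARATOR), encoded);
  }

  // physical schema name -> split by ASCII-1 -> decode each segment
  static List<String> toSegments(String physical) {
    List<String> segments = new ArrayList<>();
    for (String part : physical.split(String.valueOf(STORAGE_SEPARATOR), -1)) {
      segments.add(decodeSegment(part));
    }
    return segments;
  }
}
```

Under this assumed encoding, `toPhysical(List.of("my.schema", "sales"))` yields `my%2Eschema`, the ASCII-1 character, then `sales`, and `toSegments` restores the original two segments.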
+
+### Existing Name Compatibility and Migration Guard
+
+- Before enabling nested namespace mode for a catalog, run a pre-check scan on 
existing schema names
+  against configured logical separator.
+- If existing schema names conflict with selected separator, enabling is 
rejected with actionable
+  error and user can choose another separator or rename conflicting schema 
names.
+- Once nested mode is enabled for a catalog, creating new schema names 
containing configured logical
+  separator as plain text is rejected.
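
A minimal sketch of the pre-check scan, assuming the existing schema names of a catalog are available as plain strings. The class name, method names, and error wording are hypothetical and only illustrate the "reject with actionable error" rule above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class SeparatorPreCheckSketch {
  // Returns the existing schema names that contain the candidate separator as plain text.
  static List<String> findConflicts(Collection<String> existingSchemaNames, String separator) {
    List<String> conflicts = new ArrayList<>();
    for (String name : existingSchemaNames) {
      if (name.contains(separator)) {
        conflicts.add(name);
      }
    }
    return conflicts;
  }

  // Rejects enabling nested mode when any existing name collides with the separator.
  static void checkCanEnable(Collection<String> existingSchemaNames, String separator) {
    List<String> conflicts = findConflicts(existingSchemaNames, separator);
    if (!conflicts.isEmpty()) {
      throw new IllegalArgumentException(
          "Cannot enable nested namespace mode with separator '" + separator
              + "': conflicting schema names " + conflicts
              + ". Choose another separator or rename these schemas.");
    }
  }
}
```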
+
+### Delimiter Configuration Policy
+
+- Logical separator is configurable (for example `:`, `;`, `$`) and can be 
chosen to avoid conflict
+  with existing names.
+- Physical separator in storage remains fixed as ASCII-1 (`\u0001`).
+- Recommended: keep logical separator stable after nested namespace is enabled 
for a catalog.
+- Delimiter is configured at server level for nested-namespace parsing 
behavior.
+
+Two delimiter-governance options are under evaluation:
+
+- **Option 1 (restricted delimiter set)**:
+  - Server only accepts delimiters from a predefined allowed set.
+  - Delimiter is treated as a reserved hierarchy marker in nested-aware flows.
+  - Behavior difference for new object creation:
+    - Iceberg nested schema creation that uses delimiter as hierarchy path is 
allowed.
+    - Hive creation of a non-nested schema name that contains the configured 
delimiter is rejected
+      with validation error.
+- **Option 2 (unrestricted delimiter)**:
+  - Server allows any delimiter value configured by users.
+  - Compatibility is prioritized for engines that treat schema name as plain 
string.
+  - Behavior difference for new object creation:
+    - Hive can create a non-nested schema successfully even when the schema 
name contains the
+      configured delimiter.
+    - Nested interpretation is applied only in nested-aware request paths (for 
example Iceberg
+      namespace APIs), not as a blanket rule for all engines.
+
+### Delimiter Validation and Rejection Rationale
+
+Delimiter validity should be explicit and observable, not implicit.
+
+- Validation checkpoints:
+  - Validate delimiter when server starts or when delimiter configuration is 
updated.
+  - Re-validate against existing catalog schema names before enabling nested 
mode for a catalog.
+  - Reject invalid delimiter configuration early before request-time namespace 
operations.
+- Validation rules:
+  - Delimiter must be a single non-empty character.
+  - Delimiter must not be `.` and must not be ASCII-1 (`\u0001`) because both 
are reserved by
+    internal storage/compatibility rules.
+  - Delimiter should avoid characters that cause parser or route ambiguity in 
REST/SQL contexts
+    (for example `/`).
+  - Under Option 1, delimiter must belong to server predefined allowlist.
+  - Under Option 2, delimiter is user-defined but still must pass safety 
checks above.
+- Rejection rationale (why some delimiters are not permitted):
+  - Avoid hierarchical parsing ambiguity and inconsistent split/join behavior.
+  - Preserve compatibility with existing schema/table identifier parsing 
across engines.
+  - Prevent migration risk where existing names collide with hierarchy 
semantics.
+  - Keep cross-engine behavior predictable when Iceberg and Hive treat schema 
names differently.
+- Error reporting requirements:
+  - Return actionable validation error containing rejected delimiter, 
rejection reason, and
+    suggested alternatives.
+  - Example: "Delimiter '/' is not allowed because it conflicts with path 
parsing. Try ':', ';',
+    or '$'."
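
The validation rules above could be checked roughly as follows. This is a sketch only: the allowlist contents are taken from the examples in this document (`:`, `;`, `$`), the `restrictToAllowlist` flag stands in for the still-undecided choice between Option 1 and Option 2, and all names are hypothetical.

```java
import java.util.Optional;
import java.util.Set;

final class DelimiterValidationSketch {
  // Hypothetical allowlist for Option 1, based on the delimiters used as examples in this doc.
  static final Set<String> ALLOWLIST = Set.of(":", ";", "$");

  // Returns an error message when the delimiter is rejected, or empty when it is accepted.
  static Optional<String> validate(String delimiter, boolean restrictToAllowlist) {
    if (delimiter == null || delimiter.length() != 1) {
      return Optional.of(reject(String.valueOf(delimiter), "it must be exactly one character"));
    }
    if (delimiter.equals(".")) {
      return Optional.of(reject(".", "it is reserved by identifier compatibility rules"));
    }
    if (delimiter.equals("\u0001")) {
      return Optional.of(reject("\\u0001", "it is reserved as the internal storage separator"));
    }
    if (delimiter.equals("/")) {
      return Optional.of(reject("/", "it conflicts with path parsing"));
    }
    if (restrictToAllowlist && !ALLOWLIST.contains(delimiter)) {
      return Optional.of(reject(delimiter, "it is not in the server allowlist"));
    }
    return Optional.empty();
  }

  // Builds an actionable message: rejected delimiter, reason, and suggested alternatives.
  private static String reject(String shown, String reason) {
    return "Delimiter '" + shown + "' is not allowed because " + reason + ". Try ':', ';', or '$'.";
  }
}
```

With `restrictToAllowlist` set to `false`, a delimiter such as `#` passes because it only has to satisfy the safety checks; with `true` it is rejected.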
+
+
+### Parsing Sequence Diagram
+
+```mermaid
+sequenceDiagram
+  participant Client as Client/Connector
+  participant Config as Connector Config
+  participant API as Gravitino REST API
+  participant Mapper as HierarchicalSchema Mapper
+  participant Store as EntityStore
+
+  Client->>Config: Load namespace separator
+  Config-->>Client: separator (for example ':')
+  Client->>API: GET /schemas?parentSchema=A:B
+  API->>Mapper: Parse parentSchema by separator ':'
+  Mapper->>Store: list all schemas
+  Store-->>Mapper: all schemas
+  Mapper->>Mapper: filter direct children under A:B
+  Mapper-->>API: hierarchy-aware result
+  API-->>Client: response (Iceberg-compatible namespace view)
+```
+
+
+### Iceberg REST Side Behavior
+
+- **Create nested namespace**:
+  - Creating `A:B:C` creates (or ensures existence of) `A`, `A\u0001B`, and 
`A\u0001B\u0001C` in one atomic
+    operation.
+  - Parent-chain creation is transactional: if any step fails, all created 
parents in this request
+    are rolled back and no partial parent namespaces remain.
+  - Owner assignment rule for auto-created parents:
+    - If a parent is newly auto-created, owner must be the same as the owner 
assigned to the final
+      target namespace in the same create request.
+    - The full auto-created parent chain and the final target namespace must 
share a consistent owner
+      value for that operation.
+- **Update nested namespace**:
+  - Support updating namespace properties through mapped schema operations.
+  - Property update is applied to the mapped target namespace scope.
+- **Drop nested namespace**:
+  - Iceberg REST side does not support cascade delete for namespace.
+  - If cascade parameter is provided, return unsupported-parameter error 
directly.
+  - Drop must fail when target namespace still contains child namespaces, 
tables, or views.
+  - Users must delete children objects explicitly before dropping parent 
namespace.
+- **Rename nested namespace**: not needed because Iceberg REST does not 
support namespace rename.
+
+### Gravitino Side Behavior
+
+- `list schema` should express nested hierarchy semantics for users.
+- `list schema` REST API (GET 
`/metalakes/{metalake}/catalogs/{catalog}/schemas`) should support an
+  optional query parameter `parentSchema`.
+  - When `parentSchema` is absent, return top-level schemas only (first layer).
+  - When `parentSchema` is provided, return only the direct child schemas under the given parent
+    (next layer), instead of the full subtree.
+  - `parentSchema` value follows direct `HierarchicalSchema` path format with 
configured separator
+    (for example `A:B` when separator is `:`).
+- Gravitino does not provide a dedicated `list sub-schema` API; hierarchy is 
expressed via
+  `list schema`/`list namespaces` results.
+- Example: for schemas `A`, `B`, `A:B`, `A:B:C`, hierarchy view is `A -> A:B 
-> A:B:C` and `B`;
+  root listing returns `A` and `B`, and querying parent `A` returns `A:B`.
+- To make nested semantics explicit, `list namespaces` should express 
parent-child relationships
+  (hierarchical view) even when underlying storage is flat.
+- Example hierarchical view from flat schemas: `A` -> `A:B` -> `A:B:C`, and 
`B` as another root.
+- This list-level hierarchical expression is the primary semantic model for 
users, reducing
+  ambiguity caused by one request creating multiple physical schema objects.
+- Gravitino server REST supports namespace create/update/drop operations for 
nested namespace
+  workflows, aligned with Iceberg REST behavior.
+- Existing schema/table APIs remain compatible with non-nested cases.
+
+Examples (Gravitino REST side):
+
+- **Create from Gravitino side**
+  - Request: `POST /metalakes/m1/catalogs/c1/schemas` with `name=A:B:C`
+  - Behavior: ensure parent chain exists (`A`, `A:B`) and then create `A:B:C`.
+- **List from Gravitino side**
+  - Request: `GET /metalakes/m1/catalogs/c1/schemas`
+  - Behavior: return top-level schemas only (first layer), for example `A`, 
`B`.
+  - Request: `GET /metalakes/m1/catalogs/c1/schemas?parentSchema=A:B`
+  - Behavior: return direct children of `A:B` only (next layer), for example 
`A:B:C`, `A:B:D`.
+- **Alter from Gravitino side**
+  - Request: `PUT /metalakes/m1/catalogs/c1/schemas/A:B:C` with updates
+    (for example set/remove properties).
+  - Behavior: update properties on target schema `A:B:C` only; parent scopes 
are not modified.
+- **Delete from Gravitino side**
+  - Request: `DELETE /metalakes/m1/catalogs/c1/schemas/A:B`
+  - Behavior: fail if `A:B` still contains child namespace/table.
+  - Behavior: no cascade mode; children must be removed first, then delete 
`A:B`.
+  - Request with cascade parameter (for example `?cascade=true`): reject with 
unsupported-parameter
+    error.
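
The `parentSchema` listing semantics can be sketched as a filter over the flat list of logical schema paths. This is an illustration under assumptions: paths are logical (separator-joined) strings, a `null` parent means the root listing, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;

final class DirectChildrenSketch {
  // Returns top-level schemas when parentSchema is null, otherwise only direct children.
  static List<String> listSchemas(
      Collection<String> allLogicalPaths, String parentSchema, String separator) {
    int parentDepth = parentSchema == null ? 0 : depth(parentSchema, separator);
    // Appending the separator keeps the match on a segment boundary ("A:" does not match "AB:C").
    String prefix = parentSchema == null ? "" : parentSchema + separator;
    List<String> result = new ArrayList<>();
    for (String path : allLogicalPaths) {
      if (path.startsWith(prefix) && depth(path, separator) == parentDepth + 1) {
        result.add(path);
      }
    }
    return result;
  }

  // Number of hierarchy levels in a logical path.
  private static int depth(String path, String separator) {
    return path.split(Pattern.quote(separator), -1).length;
  }
}
```

For the schemas `A`, `B`, `A:B`, `A:B:C` with separator `:`, the root listing gives `A` and `B`, parent `A` gives `A:B`, and parent `A:B` gives `A:B:C`.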
+
+## Privileges and Authorization
+
+- Authorization follows nested namespace scope by logical `HierarchicalSchema` 
path and mapped schema name.
+- Namespace privileges follow inheritance: privilege on parent namespace 
applies to child namespace.
+- For operations requiring `USE_SCHEMA`, authorization succeeds if any 
ancestor scope
+  (including the current schema scope) has `USE_SCHEMA`; it is not required on 
every level.
+- Effective rule for `A:B:C`: check `A:B:C` -> `A:B` -> `A`, and pass on the 
first scope that has
+  `USE_SCHEMA`.
+- Parent-chain walking overhead is mitigated by existing `EntityStore` cache 
behavior; no additional
+  authorization-layer cache is introduced in this phase.
+- UI privilege granting is one usage scenario of this overall nested namespace 
solution.
+- To prevent privilege escalation, auto-creating missing parent namespaces 
requires one of:
+  - requester has `create_schema` on each missing parent scope, or
+  - requester has dedicated admin capability for bootstrap parent creation.
+- If neither condition is met, create `A:B:C` must fail instead of implicitly 
creating unauthorized
+  parent scopes.
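
The `USE_SCHEMA` inheritance rule (check `A:B:C` -> `A:B` -> `A` and pass on the first match) can be sketched as follows. The set of granted scopes stands in for the real authorization lookup, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class UseSchemaInheritanceSketch {
  // "A:B:C" -> ["A:B:C", "A:B", "A"], nearest scope first.
  static List<String> scopesNearestFirst(String path, String separator) {
    List<String> scopes = new ArrayList<>();
    String current = path;
    while (true) {
      scopes.add(current);
      int idx = current.lastIndexOf(separator);
      if (idx < 0) {
        break;
      }
      current = current.substring(0, idx);
    }
    return scopes;
  }

  // Passes as soon as the schema itself or any ancestor scope carries USE_SCHEMA.
  static boolean hasUseSchema(String path, String separator, Set<String> scopesWithUseSchema) {
    for (String scope : scopesNearestFirst(path, separator)) {
      if (scopesWithUseSchema.contains(scope)) {
        return true;
      }
    }
    return false;
  }
}
```

A grant on `A` alone is enough for `A:B:C`, while a grant on the sibling `A:B:D` is not.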
+
+### Client API Surface (Java/Python)
+
+- Java client:
+  - Keep existing string-based methods for compatibility.
+  - Only add `listSchemas(parentSchema)` support so callers can request direct 
children under a
+    specific parent namespace.
+  - For create/update/drop and other schema operations, continue using 
existing string-based schema
+    name parameters; users compose hierarchical schema names themselves (for 
example `A:B:C`).
+  - No new typed path wrapper (`SchemaPath`) or segment-based overloads are 
introduced in this
+    phase.
+
+Simple Java client examples (assuming the caller knows the configured delimiter, `:` here):
+
+- `listSchemas()` -> returns top-level schemas, for example `A`, `B`.
+- `listSchemas("A:B")` -> returns direct children, for example `A:B:C`, 
`A:B:D`.
+- `createSchema("A:B:C", ...)` -> users pass composed schema name directly.
+
+### Option P1 (Recommended): Extend `create_schema` semantics
+
+- Keep current privilege model and do not add a new privilege type.
+- Clarify `create_schema` as container-scoped capability: permission on parent 
namespace allows
+  creating direct child namespace under that scope.
+- Example: `create_schema` on `A` allows creating `A:B`, and `create_schema` 
on `A:B` allows
+  creating `A:B:C`.
+
+Pros:
+
+- Lowest implementation and migration cost.
+- Reuses existing authorization model and UI privilege workflow.
+- Keeps backward compatibility for current grants.
+
+Cons:
+
+- Semantics are less explicit because `create_schema` now covers both normal 
schema creation and
+  nested namespace creation.
+
+### Option P2: Introduce a dedicated nested-namespace privilege
+
+- Add a new privilege (for example `create_nested_namespace`) for creating 
child namespaces.
+- Keep `create_schema` semantics unchanged for existing schema creation 
behavior.
+- Evaluate both privileges independently in authorization expression where 
needed.
+
+Pros:
+
+- Clearer and more explicit permission model.
+- Better long-term extensibility for fine-grained namespace governance.
+
+Cons:
+
+- Requires privilege model/API/UI updates and migration planning.
+- Increases operational complexity for users and administrators.
+
+### Selection Guidance
+
+- Phase-1 recommends Option P1 for faster delivery and lower risk.
+- Option P2 can be considered in a later phase if stronger permission 
separation is required.
+
+Examples:
+
+- Privilege on `A:B` applies to that specific scope.
+- Privilege on `A` also applies to `A:B` (or other configured child path) 
based on the namespace inheritance rule.
+
+## Code Snippets (Design-Level)
+
+The following snippets are design-level examples to clarify how 
`HierarchicalSchema`
+(`:` preferred) should be converted and consumed in key code paths.
+
+### Snippet 1: Convert Iceberg namespace to logical path and physical schema
+
+```java
+// Example utility methods in IcebergRESTUtils (or a dedicated HierarchicalSchemaUtil)
+public static String serializeHierarchicalPath(String[] levels) {
+  // ["team", "sales"] -> "team:sales"
+  // logical separator is configurable and not allowed inside one level.
+  return String.join(configuredSeparator, levels);
+}
+
+public static String[] parseHierarchicalPath(String path) {
+  // "team:sales" -> ["team", "sales"]
+  return path.split(Pattern.quote(configuredSeparator), -1);
+}
+
+public static String toPhysicalSchemaName(String hierarchicalPath) {
+  // Reversible mapping: encode each segment before join.
+  // Example: "my.schema:sales" -> "my%2Eschema\u0001sales" (segments joined by ASCII-1, not by '.')
+  return encodeAndJoinSegments(parseHierarchicalPath(hierarchicalPath));
+}
+```
+
+### Snippet 2: Namespace extraction in authorization interceptor
+
+```java
+// Example in IcebergMetadataAuthorizationMethodInterceptor
+Namespace rawNamespace = RESTUtil.decodeNamespace(value);
+String hierarchicalPath = HierarchicalSchemaUtil.toPath(rawNamespace, configuredSeparator);
+String schema = HierarchicalSchemaUtil.toPhysicalSchemaName(hierarchicalPath);
+
+nameIdentifierMap.put(
+    Entity.EntityType.SCHEMA,
+    NameIdentifierUtil.ofSchema(metalakeName, catalog, schema));
+```
+
+### Snippet 3: Parent-scope authorization check
+
+```java
+// Example path inheritance for A:B:C
+List<String> authzScopes = HierarchicalSchemaUtil.parentScopes("A:B:C");
+// Result: ["A", "A:B", "A:B:C"]
+// Authorization passes if user has required privilege on any allowed parent scope by policy.
+```
+
+
+## Affected Classes
+
+### Iceberg REST and namespace dispatch
+
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/iceberg/service/IcebergRESTUtils.java`
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/iceberg/service/dispatcher/IcebergNamespaceOperationDispatcher.java`
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/iceberg/service/dispatcher/IcebergNamespaceOperationExecutor.java`
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/iceberg/service/dispatcher/IcebergNamespaceEventDispatcher.java`
+
+### Authorization interception
+
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/server/web/filter/BaseMetadataAuthorizationMethodInterceptor.java`
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/server/web/filter/IcebergMetadataAuthorizationMethodInterceptor.java`
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/server/web/filter/LoadTableAuthzHandler.java`
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/server/web/filter/RenameTableAuthzHandler.java`
+- `iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/server/web/filter/RenameViewAuthzHandler.java`
+
+### Identifier and metadata object mapping
+
+- `api/src/main/java/org/apache/gravitino/NameIdentifier.java`
+- `core/src/main/java/org/apache/gravitino/utils/NameIdentifierUtil.java`
+
+### Tests
+
+- `iceberg/iceberg-rest-server/src/test/java/org/apache/gravitino/server/web/filter/TestIcebergMetadataAuthorizationMethodInterceptor.java`
+- `api/src/test/java/org/apache/gravitino/TestNameIdentifier.java`
+- `core/src/test/java/org/apache/gravitino/utils/TestNameIdentifierUtil.java`
+
+## Expected Changes
+
+### 1) Namespace path mapping
+
+- Add a dedicated conversion utility for `HierarchicalSchema` path:
+  - Iceberg namespace levels -> logical path (preferred `:`).
+  - Logical path -> physical schema name (phase-1 uses ASCII-1 `\u0001` 
flattened mapping).
+- Override `Capability.specificationOnName(SCHEMA, name)` naming rules for 
related catalogs so
+  schema names containing `:` are accepted in this phase.
+
+### 2) Authorization behavior
+
+- In interceptor and handlers, stop treating the last namespace level as the 
only schema segment.
+- Build schema identity from the full namespace path through conversion rules.
+- Evaluate parent-scope inheritance using hierarchical logical scopes (`A`, 
`A:B`, `A:B:C`) before or during expression evaluation.
+- For `USE_SCHEMA` checks, short-circuit on the nearest scope that has 
permission; do not require
+  `USE_SCHEMA` on each intermediate level.
+
+### 3) Namespace operation behavior
+
+- `createNamespace` should ensure parent schemas exist for each hierarchical 
level.
+- Parent-chain creation must be atomic as one transaction; failures must not 
leave partial parents.
+- `updateNamespace` should support property updates for mapped namespace scope.
+- `dropNamespace` should target mapped physical schema and preserve existing 
non-nested behavior.
+- `listSchemas` should accept optional query parameter `parentSchema`.
+  - Absent `parentSchema`: return top-level schemas only.
+  - Present `parentSchema`: return direct children only.
+- `listNamespaces` should return hierarchy-aware semantics (or equivalent 
parent-child expression)
+  while keeping current flat storage model.
+- Catalog implementations must support namespace lifecycle APIs 
(create/list/alter/drop namespace)
+  for this feature path.
+
+### 4) Identifier compatibility
+
+- Keep `NameIdentifier` external compatibility for existing dotted identifiers.
+- Add schema-level rendering/parsing guidance for logical separator and quoted 
schema output where ambiguity exists.
+- Keep change scope limited to schema handling in this phase to reduce 
regression risk for table/view/function paths.
+
+## Compatibility
+
+- No metadata model migration required.
+- Existing non-nested namespace behavior remains unchanged.
+- Limiting quoted identifier parsing to `schema` reduces regression risk for 
catalog/table/view/function identifier parsing.
+- Internal representation in Gravitino may use `:` for hierarchical schema 
semantics.
+- Spark/Flink/Trino connector-facing namespace representation must stay 
consistent with Iceberg
+  conventions and should not require users to input Gravitino-internal `:` 
format directly.
+- Connector layer is responsible for translation between Iceberg-style 
namespace representation and
+  Gravitino internal hierarchical schema representation.
+- Connector changes are mandatory in Spark/Flink/Trino integration layers; 
this is not a pure
+  server-only change.
+- Upgrade guard:
+  - validate selected logical separator against existing schema names before 
enabling nested mode.
+  - if conflict exists, users can pick another logical separator or rename 
conflicting schemas.
+
+Rationale:
+
+- Keep connector-facing behavior aligned with Iceberg to preserve existing 
engine user experience
+  and avoid introducing a Gravitino-specific namespace syntax into 
Spark/Flink/Trino SQL or APIs.
+- Reduce migration cost and compatibility risk for current Iceberg workloads, 
scripts, and
+  operational tooling that already assume Iceberg namespace conventions.
+- Isolate internal representation changes inside Gravitino/connector 
translation boundaries, so
+  future internal evolution does not force external breaking changes for 
engine integrations.
+
+Engine Delimiter Support (Connector-facing):
+
+| Engine | User-facing namespace delimiter | Requires internal `:` input? | Notes |
+|---|---|---|---|
+| Spark connector | `a.b.c` style namespace string | No | Spark side uses dotted namespace representation and connector converts to internal format. |
+| Flink connector | `a.b.c` style namespace string | No | Flink side uses dotted namespace representation and connector converts to internal format. |
+| Trino connector | `"a.b.c"` style quoted namespace string | No | Trino side uses quoted dotted namespace representation and connector converts to internal format. |
+
+All engines should keep external namespace semantics aligned with Iceberg. 
Internal `:` is an
+implementation detail inside Gravitino and connector translation logic.
+
+### Connector `NestedNameIdentifier` Conversion
+
+Connector side must explicitly convert namespace representation when building 
and parsing
+`NestedNameIdentifier`/`NameIdentifier` values:
+
+- **Input parsing (engine -> connector)**:
+  - Read namespace in engine-native/Iceberg style.
+  - Build connector logical path first, then convert to Gravitino internal 
schema representation.
+- **Request construction (connector -> Gravitino)**:
+  - For schema-level operations, send converted schema name in 
request/path/query expected by
+    Gravitino REST.
+  - For list operations, `parentSchema` must use the converted internal 
hierarchical schema
+    representation.
+- **Response rendering (Gravitino -> connector -> engine)**:
+  - Convert internal hierarchical schema representation back to 
engine-native/Iceberg style before
+    returning identifiers to Spark/Flink/Trino users.
+- **Round-trip requirement**:
+  - `engine identifier -> connector converted identifier -> server -> 
connector rendered identifier`
+    must be stable and lossless for nested namespace paths.
+
+Example conversion flow:
+
+- Engine input: namespace `[A, B, C]`, table `t1`
+- Connector logical identifier: `A.B.C.t1` (engine-facing)
+- Connector to Gravitino identifier: schema `A:B:C`, table `t1`
+- Gravitino response schema: `A:B:C`
+- Connector rendered back to engine: namespace `[A, B, C]`, table `t1`
+
+Trino-specific example:
+
+- Trino input namespace string: `"a.b.c"`
+- Connector converts to Gravitino internal schema: `a:b:c`
+- Gravitino returns schema: `a:b:c`
+- Connector renders back to Trino quoted style: `"a.b.c"`
+
+## Plans
+- Add foundational classes for nested namespace authorization
+- Add nested namespace support to Iceberg REST API
+- Add nested namespace support to Gravitino REST API
+
+## No Plan
+- Trino connector supports nested namespace
+- Spark connector supports nested namespace
+- Flink connector supports nested namespace
+
+## Test Plan
+
+- Unit tests for schema name parse/quote handling when name contains `.`.
+- Unit/integration tests for Iceberg REST create/update/drop nested namespace 
mapping.
+- Authorization tests for nested scope behavior (`A`, `A:B`, `A:B:C`).
+- Regression tests for non-nested namespace authorization behavior.
+
