igorbelianski-cyber commented on code in PR #11041:
URL: https://github.com/apache/iceberg/pull/11041#discussion_r2791139552


##########
format/view-spec.md:
##########
@@ -160,6 +176,109 @@ Each entry in `version-log` is a struct with the 
following fields:
 | _required_  | `timestamp-ms` | Timestamp when the view's 
`current-version-id` was updated (ms from epoch) |
 | _required_  | `version-id`   | ID that `current-version-id` was set to |
 
+#### Storage Table Identifier
+
+The table identifier for the storage table that stores the precomputed results.
+
+| Requirement | Field name     | Description |
+|-------------|----------------|-------------|
+| _required_  | `namespace`    | A list of strings for namespace levels |
+| _required_  | `name`         | A string specifying the name of the table |
+
+### Storage table metadata
+
+This section describes additional metadata for the storage table that 
supplements the regular table metadata and is required for materialized views.
+The property "refresh-state" is set on the [snapshot 
summary](https://iceberg.apache.org/spec/#snapshots) property of every storage 
table snapshot to determine the freshness of the precomputed data of the 
storage table.
+
+| Requirement | Field name      | Description |
+|-------------|-----------------|-------------|
+| _required_  | `refresh-state` | A [refresh state](#refresh-state) record 
stored as a JSON-encoded string |
+
+#### Freshness
+
+A materialized view is "fresh" when the storage table adequately represents 
the logical query definition of the view.
+Since different systems define freshness differently, it is left to the 
consumer to evaluate freshness based on its own policy.
+
+**Consumer behavior:**
+
+When evaluating freshness, consumers:
+- May apply time-based freshness policies, such as allowing a staleness window 
based on `refresh-start-timestamp-ms`.
+- May compare the `source-states` list against the states loaded from the 
catalog to verify the producer's freshness interpretation.
+- May parse the view definition to implement more sophisticated policies.
+- When a materialized view is considered stale, can fail, refresh inline, or 
treat the materialized view as a logical view.
+- Should not consume the storage table as it is when the materialized view 
doesn't meet the freshness criteria.
+
+**Producer behavior:**
+
+Producers should provide the necessary information in the [refresh 
state](#refresh-state) such that consumers can verify the logical equivalence 
of the precomputed data with the query definition.
+Different producers may have different freshness interpretations, based on how 
much of the dependency graph must be current.
+Some require the entire query tree to be fully up to date, while others only 
require direct children or leaf nodes.
+
+When writing the refresh state, producers:
+- Should provide a sufficient list of source states such that consumers can 
determine freshness according to the producer's interpretation.
+- May leave the source states list empty if the source state cannot be 
determined for all objects (for example, for non-Iceberg tables).
+- Must store the entry with the oldest snapshot-id or version-id when the same 
source object is reachable through multiple paths in the dependency graph 
(diamond dependency pattern).

Review Comment:
   This reads as stricter than intended, almost like:
       "must store records if conflicts are present, and conflict resolution 
must follow the rules ..."
   (which conflicts somewhat with the option of not storing source states at 
all).
   
   Maybe rephrase the sentence as:
       "If a stored entry's source object is reachable through multiple paths 
in the dependency graph (diamond dependency pattern), then the entry with the 
oldest snapshot-id must be stored."
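
To illustrate the conditional reading suggested above, here is a minimal sketch of how a producer might collapse duplicate source-state entries. Everything in it is an assumption made for illustration: the function `dedupe_source_states`, the entry fields `name`, `snapshot_id`, and `committed_at_ms`, and the use of a commit timestamp to decide which entry is "oldest" (the quoted text does not say how age is determined).

```python
from typing import Any, Callable, Dict, Hashable, Iterable, List

Entry = Dict[str, Any]


def dedupe_source_states(
    entries: Iterable[Entry],
    identity: Callable[[Entry], Hashable],
    age: Callable[[Entry], int],
) -> List[Entry]:
    """Keep one entry per source object: the one with the smallest age key.

    An empty input yields an empty list, so the rule only applies when the
    producer chose to store source states in the first place.
    """
    oldest: Dict[Hashable, Entry] = {}
    for entry in entries:
        key = identity(entry)
        # Replace the kept entry only if this one is strictly older.
        if key not in oldest or age(entry) < age(oldest[key]):
            oldest[key] = entry
    return list(oldest.values())


# Hypothetical diamond: "orders" is reached through two paths, each of which
# observed a different snapshot of the same table.
collected = [
    {"name": "orders", "snapshot_id": 30, "committed_at_ms": 2000},
    {"name": "orders", "snapshot_id": 10, "committed_at_ms": 1000},
    {"name": "customers", "snapshot_id": 7, "committed_at_ms": 1500},
]

source_states = dedupe_source_states(
    collected,
    identity=lambda e: e["name"],       # assumed identity of a source object
    age=lambda e: e["committed_at_ms"], # assumed definition of "oldest"
)
```

With this phrasing, a producer that stores no source states has nothing to deduplicate, and the "must" only constrains which entry is kept when duplicates exist.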
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

