JanKaul commented on code in PR #11041:
URL: https://github.com/apache/iceberg/pull/11041#discussion_r1870193131
##########
format/view-spec.md:
##########
@@ -160,6 +179,56 @@ Each entry in `version-log` is a struct with the following fields:
 | _required_ | `timestamp-ms` | Timestamp when the view's `current-version-id` was updated (ms from epoch) |
 | _required_ | `version-id` | ID that `current-version-id` was set to |
+#### Full identifier
+
+The full identifier holds a reference, containing a namespace and a name, of a table or view in the catalog.
+
+| Requirement | Field name | Description |
+|-------------|----------------|-------------|
+| _optional_ | `catalog` | A string specifying the name of the catalog. If set to `null`, the catalog is the same as the views' catalog |
+| _required_ | `namespace` | A list of namespace levels |
+| _required_ | `name` | A string specifying the name of the table/view |
+
+### Materialized View Metadata stored as part of the Table Metadata
+
+A property "refresh-state" is set on the table [snapshot summary](https://iceberg.apache.org/spec/#snapshots) to determine the freshness of the precomputed data of the storage table.
+
+| Requirement | Field name | Description |
+|-------------|-----------------|-------------|
+| _required_ | `refresh-state` | A [refresh state](#refresh-state) record stored as a JSON-encoded string |
+
+#### Refresh state
+
+The refresh state record captures the state of all source tables and source views in the fully expanded query tree of the materialized view, including indirect references. Indirect references are the tables/views that are not directly referenced in the query but are nested within other views.
The refresh state has the following fields:
+
+| Requirement | Field name | Description |
+|-------------|----------------|-------------|
+| _required_ | `refresh-version-id` | The `version-id` of the materialized view when the refresh operation was performed |
+| _required_ | `source-table-states` | A list of [source table](#source-table) records for all tables that are directly or indirectly referenced in the materialized view query |
+| _required_ | `source-view-states` | A list of [source view](#source-view) records for all views that are directly or indirectly referenced in the materialized view query |
+| _required_ | `refresh-start-timestamp-ms` | A timestamp of when the refresh operation was started |
+
+#### Source table
+
+A source table record captures the state of a source table at the time of the last refresh operation.
+
+| Requirement | Field name | Description |

Review Comment:
As I said on the mailing list, I don't have a strong opinion about this issue. I was hesitant to add it because I wanted to resolve the discussion with Walaa first. I think he might be skeptical about including the table identifier here.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org