nastra commented on code in PR #9728:
URL: https://github.com/apache/iceberg/pull/9728#discussion_r1490871701

##########
format/spec.md:
##########
@@ -1237,17 +1237,37 @@ Content file (data or delete) is serialized as a JSON object according to the fo
 | **`equality-ids`** |`JSON list of int: Field ids used to determine row equality in equality delete files`|`[1]`|
 | **`sort-order-id`** |`JSON int`|`1`|
 
-### File Scan Task Serialization
-
-File scan task is serialized as a JSON object according to the following table.
-
-| Metadata field |JSON representation|Example|
-|--------------------------|--- |--- |
-| **`schema`** |`JSON object`|`See above, read schemas instead`|
-| **`spec`** |`JSON object`|`See above, read partition specs instead`|
-| **`data-file`** |`JSON object`|`See above, read content file instead`|
-| **`delete-files`** |`JSON list of objects`|`See above, read content file instead`|
-| **`residual-filter`** |`JSON object: residual filter expression`|`{"type":"eq","term":"id","value":1}`|
+### Task Serialization
+
+There could be different implementations of tasks,
+e.g., `BaseFileScanTask` and `StaticDataTask` in Java.
+A enum `task-type` field is needed to distinguish different task types.
+
+| Metadata field | JSON representation | Example |
+|-----------------|---------------------|------------------------------------------------------------------------------------------------|
+| **`task-type`** | `JSON string` | `file-scan-task`, `data-task`. Absence of this field should be interpreted as `file-scan-task` |
+
+`file-scan-task` represents a scan task with a data file and maybe delete files.
+It is serialized according to the following table (in addition to the `task-type` field).
+
+| Metadata field |JSON representation|Example|
+|------------------------|--- |--- |
+| **`schema`** |`JSON object`|`See above, read schemas instead`|

Review Comment:
   I just realized that I'm not clear on what the example `See above, read schemas instead` is trying to tell the reader. Should this maybe say `See "schemas" section above`?
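
For context on the hunk under review: the proposed `task-type` rule (a missing field is read as `file-scan-task`) can be sketched as below. This is only a minimal Python illustration of the rule as worded in the diff, not Iceberg's actual parser; `task_type_of`, `DEFAULT_TASK_TYPE`, and `KNOWN_TASK_TYPES` are hypothetical names, and rejecting unknown values is an assumption the hunk does not state.

```python
import json

# Values taken from the `task-type` row in the proposed spec table.
DEFAULT_TASK_TYPE = "file-scan-task"
KNOWN_TASK_TYPES = {"file-scan-task", "data-task"}


def task_type_of(task_json: str) -> str:
    """Return the task type of a serialized task (hypothetical helper).

    Absence of `task-type` is interpreted as `file-scan-task`, as the
    proposed spec text says. Raising on unknown values is an assumption.
    """
    task = json.loads(task_json)
    task_type = task.get("task-type", DEFAULT_TASK_TYPE)
    if task_type not in KNOWN_TASK_TYPES:
        raise ValueError(f"Unknown task type: {task_type}")
    return task_type
```

A reader would then pick the matching field table (for example the `file-scan-task` table with `schema`, `spec`, `data-file`, ...) based on the returned value.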