rdblue commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1520444991


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2068,6 +2162,145 @@ components:
           items:
             $ref: '#/components/schemas/PartitionStatisticsFile'
+    PlanTask:
+      description:
+        A JSON object that contains information provided by the server,
+        to be utilized by clients for distributed planning, should be supplied
+        as is for input in PlanTable operation.
+      type: object
+
+    FileScanTask:
+      type: object
+      required:
+        - schema
+        - spec
+        - start
+        - length
+        - data-file
+      properties:
+        data-file:
+          $ref: '#/components/schemas/ContentFile'
+        partition:
+          type: object
+          additionalProperties:
+            type: string
+        size-bytes:
+          type: number
+        start:
+          type: number
+        length:
+          type: number
+        estimated-rows-count:
+          type: number
+        delete-files:
+          type: array
+          items:
+            $ref: '#/components/schemas/ContentFile'
+        schema:
+          $ref: '#/components/schemas/Schema'

Review Comment:
   Yeah, if this is based on what gets serialized to workers in Java frameworks, we don't need it. Those tasks send the table schema and spec to be able to work with the partition tuple on the task side. But this use case assumes that the caller has access to the table.

   If we wanted it to be possible for the caller to not load the table (which we may choose to do in a later update to this API) then we would send this metadata once per request rather than on each task.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
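
For illustration only, the alternative described in the review comment (table metadata sent once per request rather than on each task) might look roughly like the fragment below. This is a sketch under stated assumptions, not something proposed in the PR: the `PlanTableResult` wrapper and its `file-scan-tasks` property are hypothetical names, and the `PartitionSpec` reference assumes a schema of that name exists in the spec. The comment itself says that, for the current use case, `schema` and `spec` are not needed at all because the caller already has the table.

```yaml
# Hypothetical sketch: PlanTableResult and file-scan-tasks are invented names,
# used only to show request-level versus task-level placement of metadata.
PlanTableResult:
  type: object
  required:
    - file-scan-tasks
  properties:
    # Sent once per response, and only if a later version of the API lets
    # callers plan without loading the table themselves.
    schema:
      $ref: '#/components/schemas/Schema'
    spec:
      $ref: '#/components/schemas/PartitionSpec'
    file-scan-tasks:
      type: array
      items:
        $ref: '#/components/schemas/FileScanTask'

# Each task then carries only file-level information; schema and spec are
# dropped from both `required` and `properties`.
FileScanTask:
  type: object
  required:
    - data-file
    - start
    - length
  properties:
    data-file:
      $ref: '#/components/schemas/ContentFile'
    start:
      type: number
    length:
      type: number
    delete-files:
      type: array
      items:
        $ref: '#/components/schemas/ContentFile'
```

With this shape, a response containing many tasks would repeat the schema and spec zero times per task instead of once per task, which is the cost the comment is pointing at.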