rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1688800465
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2774,6 +2920,30 @@ components:
           additionalProperties:
             type: string
 
+    PreplanTableResult:
+      type: object
+      required:
+        - plan-tasks
+      properties:
+        plan-tasks:
+          type: array
+          items:
+            $ref: '#/components/schemas/PlanTask'
+        next-page-token:
+          $ref: '#/components/schemas/PageToken'

Review Comment:
@rdblue Before this pagination change was made to the protocol, our assumption was that for very large tables we could create multiple shards (plan tasks) and then retrieve the list of file scan tasks for each plan task. As the number of shards grew, the number of file scan tasks per shard would stay reasonable, so pagination would not be needed.

However, I think there are cases where you may not be able to create that many plan tasks, so the number of file scan tasks per shard will be very large. In those cases you need pagination, at the very least for `plan` table.

For `preplan`, I agree that paginating the list of `PlanTask` may be overkill, but I would like to know what others think @jackye1995 @danielcweeks @amogh-jahagirdar
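To make the large-shard case concrete, here is a minimal Python sketch of how a client might drain the file scan tasks of a single plan task by following `next-page-token`. It is an illustration only, not the API proposed in this PR: the `file-scan-tasks` key, the `fetch_page` callable, and the in-memory `make_fake_endpoint` helper are all hypothetical, and the token is assumed to be an opaque string that is absent on the last page.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Hypothetical page shape (an assumption, not taken from the spec):
# {"file-scan-tasks": [...], "next-page-token": "<opaque string>"}
Page = Dict[str, object]


def iter_file_scan_tasks(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every file scan task, following next-page-token until it is absent."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page["file-scan-tasks"]
        token = page.get("next-page-token")
        if token is None:
            return


def make_fake_endpoint(tasks: List[str], page_size: int) -> Callable[[Optional[str]], Page]:
    """In-memory stand-in for a server; the token is just a stringified offset."""
    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token is not None else 0
        end = start + page_size
        page: Page = {"file-scan-tasks": tasks[start:end]}
        if end < len(tasks):
            # More results remain, so hand back a token for the next page.
            page["next-page-token"] = str(end)
        return page
    return fetch_page


# One large shard: 10 file scan tasks served 4 at a time.
endpoint = make_fake_endpoint([f"file-{i}.parquet" for i in range(10)], page_size=4)
all_tasks = list(iter_file_scan_tasks(endpoint))
```

With this shape, a shard that cannot be split further still returns bounded responses, at the cost of extra round trips per plan task.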