rahil-c commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1688800465


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2774,6 +2920,30 @@ components:
           additionalProperties:
             type: string
 
+    PreplanTableResult:
+      type: object
+      required:
+        - plan-tasks
+      properties:
+        plan-tasks:
+          type: array
+          items:
+            $ref: '#/components/schemas/PlanTask'
+        next-page-token:
+          $ref: '#/components/schemas/PageToken'

Review Comment:
   @rdblue 
   Before this pagination change was made to the protocol, the assumption we 
were making was that for very large tables we could create multiple shards (plan 
tasks) and then retrieve the list of file scan tasks for each plan task. As the 
number of shards grew, the number of file scan tasks per shard would stay 
reasonable, so pagination would not be needed.
   
   However, I think there are cases where you may not be able to create that 
many plan tasks, so the number of file scan tasks per shard will be very 
large. In those cases you will need pagination, at the very least for `plan` 
table. 
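
   To make the paginated case concrete, here is a minimal client-side sketch of draining one shard page by page. Only `next-page-token` comes from the schema in this diff; the `file-scan-tasks` field name, the `fetch_page` callable, and the in-memory fake server (which uses an offset as the token) are assumptions for illustration, not part of the spec.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]


def iter_file_scan_tasks(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every file scan task, following next-page-token until it is absent."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        # "file-scan-tasks" is an assumed field name for this sketch.
        yield from page.get("file-scan-tasks", [])
        token = page.get("next-page-token")
        if not token:
            break


def make_fake_server(tasks: List[str], page_size: int) -> Callable[[Optional[str]], Page]:
    """Hypothetical in-memory stand-in for the catalog; the token is just an offset."""

    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token else 0
        end = start + page_size
        page: Page = {"file-scan-tasks": tasks[start:end]}
        # Only advertise a next page when there is more data to return.
        if end < len(tasks):
            page["next-page-token"] = str(end)
        return page

    return fetch_page


if __name__ == "__main__":
    server = make_fake_server([f"task-{i}" for i in range(7)], page_size=3)
    print(list(iter_file_scan_tasks(server)))
```

   With this shape the client never holds more than one page of file scan tasks from the response at a time, which is the property that matters when a shard cannot be split further.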
   
   For `preplan`, I agree that pagination for the list of `PlanTask` might be 
overkill, but I would like to know what others think @jackye1995 
@danielcweeks @amogh-jahagirdar 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

