rdblue commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1520451266


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -537,6 +537,108 @@ paths:
         5XX:
           $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:
+    parameters:
+      - $ref: '#/components/parameters/prefix'
+      - $ref: '#/components/parameters/namespace'
+      - $ref: '#/components/parameters/table'
+    post:
+      tags:
+        - Catalog API
+      summary: Find plan-tasks based on a plan context.
+      description:
+        Scan pre-planning creates a set of opaque planning tasks for a set of scan configuration options.
+        Each task can be passed to the plan endpoint to fetch a (disjoint) subset of the file scan tasks for the scan.
+
+        Scan pre-planning enables breaking scan planning across multiple tasks.
+        This can be used to parallelize scan planning requests, use fewer resources in each planning request,
+        or to delay parts of scan planning that may not be needed.

Review Comment:
   I liked some of the information in the last version:
   * Plan tasks are opaque
   * Plan tasks are expected to produce disjoint subsets of the file scan tasks (no overlap between plan tasks!)
   * Plan tasks can reduce the resources required for planning requests
   * Plan tasks can be used to delay requests for more tasks, in case they are not needed

   We don't necessarily need all of that, but I think there's still value there. I'd also lean toward giving more context for requirements such as plan tasks producing disjoint subsets.
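
To give the disjointness requirement some context, here is a minimal sketch of how a client might consume plan tasks. Everything in it is an assumption made for illustration: `preplan`, `plan`, `plan_all`, and `DATA_FILES` are hypothetical in-memory stand-ins, not the REST endpoints or any Iceberg client API, and the base64 token format is invented. The point is only that a client concatenates the results of independent plan requests, so any overlap between plan tasks would cause the same file to be scanned twice.

```python
import base64
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a catalog server; not the Iceberg REST API.
DATA_FILES = [f"data/file-{i:02d}.parquet" for i in range(10)]


def preplan(num_tasks: int) -> list[str]:
    """Return opaque plan tasks; clients must not interpret their contents."""
    return [
        base64.urlsafe_b64encode(f"{i}/{num_tasks}".encode()).decode()
        for i in range(num_tasks)
    ]


def plan(plan_task: str) -> list[str]:
    """Resolve one plan task into its own slice of the file scan tasks."""
    decoded = base64.urlsafe_b64decode(plan_task).decode()
    index, total = map(int, decoded.split("/"))
    # Striding by task index guarantees that no file is returned by two tasks.
    return DATA_FILES[index::total]


def plan_all(plan_tasks: list[str]) -> list[str]:
    """Plan every task in parallel and concatenate the results."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        per_task = list(pool.map(plan, plan_tasks))
    return [path for files in per_task for path in files]


tasks = preplan(3)
files = plan_all(tasks)
# Disjoint subsets: the concatenation has no duplicates and covers every file.
assert len(files) == len(set(files)) == len(DATA_FILES)
```

Delayed planning falls out of the same shape: a client that only needs part of the scan can call `plan(tasks[0])` and never resolve the remaining tasks.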



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -537,6 +537,104 @@ paths:
         5XX:
           $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:
+    parameters:
+      - $ref: '#/components/parameters/prefix'
+      - $ref: '#/components/parameters/namespace'
+      - $ref: '#/components/parameters/table'
+    post:
+      tags:
+        - Catalog API
+      summary: Prepare a list of plan tasks that can be used later for table scan planning
+      description:
+        Prepare a list of plan tasks that can be used later for table scan planning.
+        Each plan task in the response of this API can be used as the `plan-task` in the `PlanTable` API request to perform scan planning against a subset of the table files.
+        This can be used to parallelize and distribute table scan planning.

Review Comment:
   Can you wrap these lines and make them separate paragraphs?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

