rdblue commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1645215603


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2838,6 +2988,69 @@ components:
           additionalProperties:
             type: string
 
+    PreplanTableRequest:
+      type: object
+      required:
+        - select
+        - filter
+      properties:
+        select:
+          description: A list of the selected column names. If referencing a 
column with a nested field, ensure full name is joined by `.` such as `a.b`
+          type: array
+          items:
+            type: string
+        filter:
+          $ref: '#/components/schemas/Expression'
+        case-sensitive:
+          description: Indicates whether column selection and filtering should 
be case sensitive
+          type: boolean
+          default: true
+        snapshot-id:
+          description: Defaults to the table's current snapshot; This option 
is not allowed when `snapshot-range` is present in the request.
+          type: integer
+          format: int64
+        snapshot-range:
+          description: A JSON list containing exactly 2 snapshot IDs 
representing the start (exclusive) and end (inclusive) snapshots. This option 
is not allowed when `snapshot-id` is present in the request.
+          type: array
+          items:
+            type: integer
+            format: int64
+
+    PlanTableRequest:
+      type: object
+      required:
+        - select
+      properties:
+        select:
+          description: A list of the selected column names
+          type: array
+          items:
+            type: string
+        filter:
+          $ref: '#/components/schemas/Expression'
+        case-sensitive:
+          description: Indicates whether column selection and filtering should 
be case sensitive
+          type: boolean
+          default: true
+        stats-fields:
+          description: A list of field names for which stats should be included
+          type: array
+          items:
+            type: string
+        plan-task:
+          $ref: '#/components/schemas/PlanTask'
+        snapshot-id:

Review Comment:
   @danielcweeks, this was my suggestion because I think it makes the API 
simpler. Rather than having several different options and having a more 
complicated spec, we can collapse all of those into the snapshot the client 
actually resolves.
   
   That also avoids cases where the client ends up reading a different snapshot 
than it thinks is current.



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3789,6 +4003,21 @@ components:
             EmptyResponse:
               $ref: '#/components/examples/ListNamespacesEmptyExample'
 
+    PayloadTooLargeWithPlanResponse:
+      description:
+        Payload is too large. Required to use /preplan before /plan
+      content:
+        application/json:
+          schema:
+            $ref: '#/components/schemas/IcebergErrorResponse'
+          example: {
+            "error": {
+              "message": " Cannot plan scan: too many task results; use 
/preplan to split planning across multiple requests",
+              "type": "PayloadTooLargeWithPlanException",
+              "code": 413

Review Comment:
   I suggested either 413 or 422, but you're right. 422 is the better option.



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -532,6 +532,100 @@ paths:
         5XX:
           $ref: '#/components/responses/ServerErrorResponse'
 
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/preplan:
+    parameters:
+      - $ref: '#/components/parameters/prefix'
+      - $ref: '#/components/parameters/namespace'
+      - $ref: '#/components/parameters/table'
+    post:
+      tags:
+        - Catalog API
+      summary: Find plan-tasks based on a plan context.
+      description:
+        When a user submits a query, this operation will find the relevant 
plan-tasks
+        based on the user's selected columns, and filters. The plan-tasks can 
be later used during PlanTable
+        to distribute this work for performance gain.
+      operationId: PreplanTable
+      requestBody:
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/PreplanTableRequest'
+      responses:
+        200:
+          $ref: '#/components/responses/PreplanTableResponse'
+        400:
+          $ref: '#/components/responses/BadRequestErrorResponse'
+        401:
+          $ref: '#/components/responses/UnauthorizedResponse'
+        403:
+          $ref: '#/components/responses/ForbiddenResponse'
+        404:
+          description:
+            Not Found
+            - NoSuchTableException, the table does not exist
+            - NoSuchNamespaceException, the namespace does not exist
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/ErrorModel'
+              examples:
+                TableDoesNotExist:
+                  $ref: '#/components/examples/NoSuchTableError'
+                NamespaceDoesNotExist:
+                  $ref: '#/components/examples/NoSuchNamespaceError'
+        419:
+          $ref: '#/components/responses/AuthenticationTimeoutResponse'
+        503:
+          $ref: '#/components/responses/ServiceUnavailableResponse'
+        5XX:
+          $ref: '#/components/responses/ServerErrorResponse'
+
+  /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/:
+    parameters:
+      - $ref: '#/components/parameters/prefix'
+      - $ref: '#/components/parameters/namespace'
+      - $ref: '#/components/parameters/table'
+    post:
+      tags:
+        - Catalog API
+      summary: Initiate plan on table to find relevant file scan tasks for a 
specific query.
+      operationId: PlanTable
+      description:
+        When a user submits a query, we can do a plan to find the relevant 
file scan tasks
+        based on the user's selected columns, and filters. For enhanced query 
performance, 
+        users can provide a plan-task which will distribute the planning work.
+      requestBody:
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/PlanTableRequest'
+      responses:

Review Comment:
   This is relevant for a 
[comment](https://github.com/apache/iceberg/pull/9695#discussion_r1641334986) 
from @danielcweeks below. Looks like 413 isn't the right response because it 
refers to the size of the request.



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2838,6 +2978,63 @@ components:
           additionalProperties:
             type: string
 
+    PreplanTableRequest:
+      type: object
+      required:
+        - select
+        - filter
+        - case-sensitive
+      properties:
+        select:
+          description: A list of the selected column names
+          type: array
+          items:
+            type: string
+        filter:
+          $ref: '#/components/schemas/Expression'
+        case-sensitive:
+          description: Indicates whether column selection and filtering should 
be case sensitive
+          type: boolean
+        snapshot-id:
+          description: an int64 snapshot ID (if snapshot-range is not 
present); optional and defaults to the table's current snapshot
+          type: integer
+          format: int64
+        snapshot-range:
+          description: a JSON list containing exactly 2 int64 snapshot IDs (if 
snapshot-id is not present) representing the start (defaults to exclusive, see 
`start-exclusive`) and end (only inclusive) snapshots
+          type: array
+          items:
+            type: integer
+            format: int64
+        start-exclusive:
+          description: Indicates whether the user wants the start of 
`snapshot-range` to be inclusive or exclusive
+          type: boolean
+          default: true
+
+    PlanTableRequest:
+      type: object
+      required:
+        - select
+        - filter
+        - case-sensitive
+      properties:
+        select:
+          description: A list of the selected column names

Review Comment:
   Look good. Thank you!



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2838,6 +2988,69 @@ components:
           additionalProperties:
             type: string
 
+    PreplanTableRequest:
+      type: object
+      required:
+        - select
+        - filter
+      properties:
+        select:
+          description: A list of the selected column names. If referencing a 
column with a nested field, ensure full name is joined by `.` such as `a.b`
+          type: array
+          items:
+            type: string
+        filter:
+          $ref: '#/components/schemas/Expression'
+        case-sensitive:
+          description: Indicates whether column selection and filtering should 
be case sensitive
+          type: boolean
+          default: true
+        snapshot-id:
+          description: Defaults to the table's current snapshot; This option 
is not allowed when `snapshot-range` is present in the request.
+          type: integer
+          format: int64
+        snapshot-range:
+          description: A JSON list containing exactly 2 snapshot IDs 
representing the start (exclusive) and end (inclusive) snapshots. This option 
is not allowed when `snapshot-id` is present in the request.
+          type: array
+          items:
+            type: integer
+            format: int64
+
+    PlanTableRequest:
+      type: object
+      required:
+        - select
+      properties:
+        select:
+          description: A list of the selected column names
+          type: array
+          items:
+            type: string
+        filter:
+          $ref: '#/components/schemas/Expression'
+        case-sensitive:
+          description: Indicates whether column selection and filtering should 
be case sensitive
+          type: boolean
+          default: true
+        stats-fields:
+          description: A list of field names for which stats should be included
+          type: array
+          items:
+            type: string
+        plan-task:
+          $ref: '#/components/schemas/PlanTask'
+        snapshot-id:
+          description: defaults to the table's current snapshot. This option 
is not allowed when `snapshot-range` or 'plan-task` is present in the request.
+          type: integer
+          format: int64
+
+        snapshot-range:
+          description: A JSON list containing exactly 2 snapshot IDs 
representing the start (exclusive) and end (inclusive) snapshots. This option 
is not allowed when `snapshot-id` or `plan-task` is present in the request.
+          type: array

Review Comment:
   This is something we could do. I suggested the range array but I called out 
that we may just want to have start and end: 
https://github.com/apache/iceberg/pull/9695#discussion_r1494914340
   
   If you prefer the other option then let's go with that.



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2838,6 +2988,69 @@ components:
           additionalProperties:
             type: string
 
+    PreplanTableRequest:
+      type: object
+      required:
+        - select
+        - filter
+      properties:
+        select:
+          description: A list of the selected column names. If referencing a 
column with a nested field, ensure full name is joined by `.` such as `a.b`
+          type: array
+          items:
+            type: string
+        filter:
+          $ref: '#/components/schemas/Expression'
+        case-sensitive:
+          description: Indicates whether column selection and filtering should 
be case sensitive
+          type: boolean
+          default: true
+        snapshot-id:
+          description: Defaults to the table's current snapshot; This option 
is not allowed when `snapshot-range` is present in the request.
+          type: integer
+          format: int64
+        snapshot-range:
+          description: A JSON list containing exactly 2 snapshot IDs 
representing the start (exclusive) and end (inclusive) snapshots. This option 
is not allowed when `snapshot-id` is present in the request.
+          type: array
+          items:
+            type: integer
+            format: int64
+
+    PlanTableRequest:
+      type: object
+      required:
+        - select
+      properties:
+        select:
+          description: A list of the selected column names
+          type: array
+          items:
+            type: string
+        filter:
+          $ref: '#/components/schemas/Expression'
+        case-sensitive:
+          description: Indicates whether column selection and filtering should 
be case sensitive
+          type: boolean
+          default: true
+        stats-fields:
+          description: A list of field names for which stats should be included
+          type: array
+          items:
+            type: string
+        plan-task:
+          $ref: '#/components/schemas/PlanTask'
+        snapshot-id:

Review Comment:
   Here's the context: 
https://github.com/apache/iceberg/pull/9695#discussion_r1494914340



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to