amogh-jahagirdar commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1731433443
########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. Review Comment: Recommend taking a look at the other API summaries but typically in the summary you want to describe the purpose of the API upfront rather than the input/output which can follow subsequently (or in the description itself). To someone reading the summary they're not really going to understand the output PlanTask/FileScanTask until they get into the description below anyways. So for this, I'd recommend a simple summary as: "Plan a table scan" ########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. + description: + Prepares either a list of `PlanTask` that can be used to distribute table scan planning, or a list of `FileScanTask`, based on a set of table scan criteria + such as selected columns, filters, snapshot range, case sensitivity, etc. + In the event that the plan tasks or file scan tasks are not ready to be served, the service will return a `plan-id`, + which can be used as input in the `GetTasksStatus`. + + Requires that the client specifies only a `snapshot-id` for a snapshot scan, or for performing incremental scans only provide + a `start-snapshot-id` and an `end-snapshot-id`. Review Comment: Wording nit so it reads more "parallel": "Clients must specify one of either a `snapshot-id` for a snapshot scan or a `start-snapshot-id` and `end-snapshot-id` for incremental scans." ########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. + description: + Prepares either a list of `PlanTask` that can be used to distribute table scan planning, or a list of `FileScanTask`, based on a set of table scan criteria + such as selected columns, filters, snapshot range, case sensitivity, etc. + In the event that the plan tasks or file scan tasks are not ready to be served, the service will return a `plan-id`, + which can be used as input in the `GetTasksStatus`. + + Requires that the client specifies only a `snapshot-id` for a snapshot scan, or for performing incremental scans only provide + a `start-snapshot-id` and an `end-snapshot-id`. + + Each `PlanTask`, can be used as input in the `RetrieveTasks` API + to request a subset of file scan tasks in a table scan. + This mechanism allows clients to distribute and parallelize the entire table scan planning process. + operationId: planTable + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/PlanTableRequest' + responses: + 200: + $ref: '#/components/responses/PlanTableResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 406: + $ref: '#/components/responses/UnsupportedOperationResponse' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/{id}: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + - $ref: '#/components/parameters/id' + + get: + tags: + - Catalog API + summary: Uses a `plan-id` as input to get status of `planTable`. Returns a list of `PlanTask`, a list of `FileScanTask`, or both when plan is complete. + operationId: GetTaskStatus + description: + Gets the status of a plan by using a `plan-id` which can be obtained from `planTable`. + If the plan is not completed, returns a `plan-status` representing the state of the plan. + If the plan is completed, returns a list of `PlanTask` or `FileScanTask`. + Review Comment: Nit: Don't think you need a newline here ########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. + description: + Prepares either a list of `PlanTask` that can be used to distribute table scan planning, or a list of `FileScanTask`, based on a set of table scan criteria + such as selected columns, filters, snapshot range, case sensitivity, etc. + In the event that the plan tasks or file scan tasks are not ready to be served, the service will return a `plan-id`, + which can be used as input in the `GetTasksStatus`. + + Requires that the client specifies only a `snapshot-id` for a snapshot scan, or for performing incremental scans only provide + a `start-snapshot-id` and an `end-snapshot-id`. + + Each `PlanTask`, can be used as input in the `RetrieveTasks` API + to request a subset of file scan tasks in a table scan. + This mechanism allows clients to distribute and parallelize the entire table scan planning process. + operationId: planTable + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/PlanTableRequest' + responses: + 200: + $ref: '#/components/responses/PlanTableResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 406: + $ref: '#/components/responses/UnsupportedOperationResponse' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/{id}: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + - $ref: '#/components/parameters/id' + + get: + tags: + - Catalog API + summary: Uses a `plan-id` as input to get status of `planTable`. Returns a list of `PlanTask`, a list of `FileScanTask`, or both when plan is complete. Review Comment: Same point as above, the summary should just be a description of what the API is imo. In this case, I think "Gets the status of a scan plan" is sufficient. ########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. + description: + Prepares either a list of `PlanTask` that can be used to distribute table scan planning, or a list of `FileScanTask`, based on a set of table scan criteria + such as selected columns, filters, snapshot range, case sensitivity, etc. + In the event that the plan tasks or file scan tasks are not ready to be served, the service will return a `plan-id`, + which can be used as input in the `GetTasksStatus`. Review Comment: "which can be used as input in GetTasksStatus" ########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. + description: + Prepares either a list of `PlanTask` that can be used to distribute table scan planning, or a list of `FileScanTask`, based on a set of table scan criteria + such as selected columns, filters, snapshot range, case sensitivity, etc. + In the event that the plan tasks or file scan tasks are not ready to be served, the service will return a `plan-id`, + which can be used as input in the `GetTasksStatus`. + + Requires that the client specifies only a `snapshot-id` for a snapshot scan, or for performing incremental scans only provide + a `start-snapshot-id` and an `end-snapshot-id`. + + Each `PlanTask`, can be used as input in the `RetrieveTasks` API + to request a subset of file scan tasks in a table scan. + This mechanism allows clients to distribute and parallelize the entire table scan planning process. + operationId: planTable + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/PlanTableRequest' + responses: + 200: + $ref: '#/components/responses/PlanTableResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 406: + $ref: '#/components/responses/UnsupportedOperationResponse' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/{id}: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + - $ref: '#/components/parameters/id' + + get: + tags: + - Catalog API + summary: Uses a `plan-id` as input to get status of `planTable`. Returns a list of `PlanTask`, a list of `FileScanTask`, or both when plan is complete. + operationId: GetTaskStatus + description: + Gets the status of a plan by using a `plan-id` which can be obtained from `planTable`. + If the plan is not completed, returns a `plan-status` representing the state of the plan. + If the plan is completed, returns a list of `PlanTask` or `FileScanTask`. + + If an invalid `plan-id` is provided, the service will return a `404 NoSuchPlanIdError` exception indicating that the requested plan does not exist. + responses: + 200: + $ref: '#/components/responses/GetTasksStatusResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchPlanIdException, the plan-id does not exist + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + PlanIdDoesNotExist: + $ref: '#/components/examples/NoSuchPlanIdError' + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + delete: + tags: + - Catalog API + summary: Uses a `plan-id` as input to cancel a `planTable` operation. + operationId: CancelPlan + description: + Uses a `plan-id` as input to cancel a `planTable` operation. + If successful, should return a state that says "cancelled". + + If an invalid `plan-id` is provided, the service will return a `404 NoSuchPlanIdError` exception indicating that the requested plan does not exist. Review Comment: I think there are other failure cases for this API where there's an invalid state change for the current plan task -> "Cancelled -> Cancelled" "Failed -> Cancelled" There should probably be some 4xx error for this. Maybe the first one doesn't necessarily need to be a failure case either but definitely the second one (It doesn't make sense to cancel a failed plan). ########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. + description: + Prepares either a list of `PlanTask` that can be used to distribute table scan planning, or a list of `FileScanTask`, based on a set of table scan criteria + such as selected columns, filters, snapshot range, case sensitivity, etc. + In the event that the plan tasks or file scan tasks are not ready to be served, the service will return a `plan-id`, + which can be used as input in the `GetTasksStatus`. + + Requires that the client specifies only a `snapshot-id` for a snapshot scan, or for performing incremental scans only provide + a `start-snapshot-id` and an `end-snapshot-id`. + + Each `PlanTask`, can be used as input in the `RetrieveTasks` API + to request a subset of file scan tasks in a table scan. + This mechanism allows clients to distribute and parallelize the entire table scan planning process. + operationId: planTable + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/PlanTableRequest' + responses: + 200: + $ref: '#/components/responses/PlanTableResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 406: + $ref: '#/components/responses/UnsupportedOperationResponse' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/{id}: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + - $ref: '#/components/parameters/id' + + get: + tags: + - Catalog API + summary: Uses a `plan-id` as input to get status of `planTable`. Returns a list of `PlanTask`, a list of `FileScanTask`, or both when plan is complete. + operationId: GetTaskStatus + description: + Gets the status of a plan by using a `plan-id` which can be obtained from `planTable`. + If the plan is not completed, returns a `plan-status` representing the state of the plan. + If the plan is completed, returns a list of `PlanTask` or `FileScanTask`. + + If an invalid `plan-id` is provided, the service will return a `404 NoSuchPlanIdError` exception indicating that the requested plan does not exist. + responses: + 200: + $ref: '#/components/responses/GetTasksStatusResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchPlanIdException, the plan-id does not exist + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + PlanIdDoesNotExist: + $ref: '#/components/examples/NoSuchPlanIdError' + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + delete: + tags: + - Catalog API + summary: Uses a `plan-id` as input to cancel a `planTable` operation. Review Comment: Same as above, I think "Cancel a plan table operation" is sufficient for the summary ########## open-api/rest-catalog-open-api.yaml: ########## @@ -541,6 +541,216 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + post: + tags: + - Catalog API + summary: Returns either a list of `PlanTask`, a list of `FileScanTask`. If planning is not complete returns a `plan-id`. + description: + Prepares either a list of `PlanTask` that can be used to distribute table scan planning, or a list of `FileScanTask`, based on a set of table scan criteria + such as selected columns, filters, snapshot range, case sensitivity, etc. + In the event that the plan tasks or file scan tasks are not ready to be served, the service will return a `plan-id`, + which can be used as input in the `GetTasksStatus`. + + Requires that the client specifies only a `snapshot-id` for a snapshot scan, or for performing incremental scans only provide + a `start-snapshot-id` and an `end-snapshot-id`. + + Each `PlanTask`, can be used as input in the `RetrieveTasks` API + to request a subset of file scan tasks in a table scan. + This mechanism allows clients to distribute and parallelize the entire table scan planning process. + operationId: planTable + requestBody: + content: + application/json: + schema: + $ref: '#/components/schemas/PlanTableRequest' + responses: + 200: + $ref: '#/components/responses/PlanTableResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 406: + $ref: '#/components/responses/UnsupportedOperationResponse' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/{id}: + parameters: + - $ref: '#/components/parameters/prefix' + - $ref: '#/components/parameters/namespace' + - $ref: '#/components/parameters/table' + - $ref: '#/components/parameters/id' + + get: + tags: + - Catalog API + summary: Uses a `plan-id` as input to get status of `planTable`. Returns a list of `PlanTask`, a list of `FileScanTask`, or both when plan is complete. + operationId: GetTaskStatus + description: + Gets the status of a plan by using a `plan-id` which can be obtained from `planTable`. + If the plan is not completed, returns a `plan-status` representing the state of the plan. + If the plan is completed, returns a list of `PlanTask` or `FileScanTask`. + + If an invalid `plan-id` is provided, the service will return a `404 NoSuchPlanIdError` exception indicating that the requested plan does not exist. + responses: + 200: + $ref: '#/components/responses/GetTasksStatusResponse' + 400: + $ref: '#/components/responses/BadRequestErrorResponse' + 401: + $ref: '#/components/responses/UnauthorizedResponse' + 403: + $ref: '#/components/responses/ForbiddenResponse' + 404: + description: + Not Found + - NoSuchPlanIdException, the plan-id does not exist + - NoSuchTableException, the table does not exist + - NoSuchNamespaceException, the namespace does not exist + content: + application/json: + schema: + $ref: '#/components/schemas/IcebergErrorResponse' + examples: + PlanIdDoesNotExist: + $ref: '#/components/examples/NoSuchPlanIdError' + TableDoesNotExist: + $ref: '#/components/examples/NoSuchTableError' + NamespaceDoesNotExist: + $ref: '#/components/examples/NoSuchNamespaceError' + 419: + $ref: '#/components/responses/AuthenticationTimeoutResponse' + 503: + $ref: '#/components/responses/ServiceUnavailableResponse' + 5XX: + $ref: '#/components/responses/ServerErrorResponse' + + delete: + tags: + - Catalog API + summary: Uses a `plan-id` as input to cancel a `planTable` operation. + operationId: CancelPlan + description: + Uses a `plan-id` as input to cancel a `planTable` operation. + If successful, should return a state that says "cancelled". Review Comment: Does this API need to return a state? I feel like a 204 status code is sufficient to indicate the cancellation was successful. And we know that for this API the only valid state change that can happen is "Started -> Cancelled".. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org