rdblue commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1692062862
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3647,6 +3818,176 @@ components:
             type: integer
           description: "List of equality field IDs"
 
+    PreplanTableRequest:
+      type: object
+      required:
+        - table-scan-context
+      properties:
+        table-scan-context:
+          $ref: '#/components/schemas/TableScanContext'
+
+    PlanTableRequest:
+      type: object
+      required:
+        - table-scan-context
+      properties:
+        table-scan-context:
+          $ref: '#/components/schemas/TableScanContext'
+        plan-task:
+          $ref: '#/components/schemas/PlanTask'
+        stats-fields:
+          description:
+            A list of fields that the client requests the server to send statistics
+            in each `FileScanTask` returned in the response
+          type: array
+          items:
+            $ref: '#/components/schemas/FieldName'
+
+    TableScanContext:
+      anyOf:
+        - $ref: '#/components/schemas/SnapshotScanContext'
+        - $ref: '#/components/schemas/IncrementalSnapshotScanContext'
+
+    BaseTableScanContext:
+      discriminator:
+        propertyName: type
+        mapping:
+          snapshot-scan: '#/components/schemas/SnapshotScanContext'
+          incremental-snapshot-scan: '#/components/schemas/IncrementalSnapshotScanContext'
+      type: object
+      required:
+        - type
+      properties:
+        type:
+          type: string
+
+    SnapshotScanContext:
+      description: context for scanning data in a specific snapshot
+      type: object
+      allOf:
+        - $ref: '#/components/schemas/BaseTableScanContext'
+      required:
+        - type
+      properties:
+        type:
+          type: string
+          enum: ["snapshot-scan"]
+        select:
+          $ref: '#/components/schemas/SelectedFieldNames'
+        filter:
+          $ref: '#/components/schemas/Filter'
+        case-sensitive:
+          description: If field selection and filtering should be case sensitive
+          type: boolean
+          default: true
+        snapshot-id:
+          description:
+            The ID of the snapshot to use for the table scan.
+            If not specified, the snapshot at the main branch head will be used.
+          type: integer
+          format: int64
+        use-snapshot-schema:
+          description:
+            If the schema of the specific snapshot should be used instead of the table schema.
+          type: boolean
+          default: false
+
+    IncrementalSnapshotScanContext:
+      description:
+        Context for scanning data appended in a range of snapshots.
+        The scan always follows the schema of the snapshot at the main branch head.
+      type: object
+      allOf:
+        - $ref: '#/components/schemas/BaseTableScanContext'
+      required:
+        - type
+        - start-snapshot-id
+      properties:
+        type:
+          type: string
+          enum: ["incremental-snapshot-scan"]
+        select:
+          $ref: '#/components/schemas/SelectedFieldNames'
+        filter:
+          $ref: '#/components/schemas/Filter'
+        case-sensitive:
+          description: If field selection and filtering should be case sensitive
+          type: boolean
+          default: true
+        start-snapshot-id:
+          description: The ID of the starting snapshot of the incremental scan
+          type: integer
+          format: int64
+        inclusive-start:
+          description: If the data appended in the start snapshot should be included in the scan
+          type: boolean
+          default: false
+        end-snapshot-id:
+          description:
+            The ID of the inclusive ending snapshot of the incremental scan.
+            If not specified, the snapshot at the main branch head will be used as the end snapshot.
+          type: integer
+          format: int64
+
+    FieldName:
+      description:
+        A field name that follows the Iceberg naming standard, and can be used in APIs like
+        Java `Schema#findField(String name)`.
+
+        The nested field name follows these rules
+        - nested struct fields are named by concatenating field names at each struct level using dot (`.`) delimiter,
+          e.g. employer.contact_info.address.zip_code
+        - nested fields in a map key are named using the keyword `key`, e.g. employee_address_map.key.first_name
+        - nested fields in a map value are named using the keyword `value`, e.g. employee_address_map.value.zip_code
+        - nested fields in a list are named using the keyword `element`, e.g. employees.element.first_name
+      type: string
+
+    SelectedFieldNames:
+      description:
+        A list of fields in schema that are selected in a table scan.
+        When not specified, all columns in the requested schema should be selected.
+      type: array
+      items:
+        $ref: '#/components/schemas/FieldName'
+
+    Filter:

Review Comment:
   `Filter` is used in a couple of places, but we already have an expression type and I don't see a reason to add a subtype whose only difference is a default. Whether an expression is optional, and what it means when one is omitted, is up to the request that uses it. For the scan context's `filter`, I think the behavior when there _is no filter_ is fairly clear: do not filter. We can state that explicitly if anyone thinks it is needed, but I don't think we need to require clients to send `"filter": {"type": "true"}`.
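To make the suggestion concrete, here is a rough sketch of how the scan context could reference the expression type directly. It assumes the existing expression schema is `#/components/schemas/Expression`, and the comments are only illustrative, not proposed spec text:

```yaml
    # Sketch only: assumes the existing expression schema is named `Expression`.
    SnapshotScanContext:
      type: object
      required:
        - type    # `filter` is deliberately not listed here, so it stays optional
      properties:
        type:
          type: string
          enum: ["snapshot-scan"]
        filter:
          # Reuse the existing expression type instead of a `Filter` subtype with a default.
          $ref: '#/components/schemas/Expression'
```

With a shape like this, a `table-scan-context` of `{"type": "snapshot-scan"}` with no `filter` key would simply mean an unfiltered scan, and the same change would apply to `IncrementalSnapshotScanContext`.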