singhpk234 commented on code in PR #13879:
URL: https://github.com/apache/iceberg/pull/13879#discussion_r2617241056
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3265,6 +3265,136 @@ components:
additionalProperties:
type: string
+ ReadRestrictions:
+ type: object
+ description: >
+ Read restrictions for a table, including projections and row filter
expressions, according to the current schema.
+
+ A client MUST enforce the restrictions defined in this object when
reading data
+ from the table.
+
+ These restrictions apply only to the authenticated principal, user,
or account
+ associated with the client. They MUST NOT be interpreted as global
policy and
+ MUST NOT be applied beyond the entity identified by the
Authentication header
+ (or other applicable authentication mechanism).
+ properties:
+ required-projections:
+ description: >
+ A list of projections that MUST be applied prior to any
query-specified
+ projections.
+ If the required-projection property is absent, no mandatory
projection applies,
+ and a reader MAY project any subset of columns of the table,
including all columns.
+
+ 1. A reader MUST project only columns listed in the
required-projection.
+ - If a listed column has a transform, the reader MUST apply it
and replace
+ all references to the underlying column with the transformed
value
+ (for example, truncate[4](cc) MUST be projected as
truncate[4](cc) AS cc,
+ and all references to cc during query evaluation post applying
required-row-filter MUST resolve to this alias).
+ - Columns not listed in the required-projection MUST NOT be read.
+
+ 2. A column MUST appear at most once in the required-projection.
+
+ 3. Multiple transformed versions of the same column (e.g.,
truncate[5](col)
+ and truncate[3](col) MUST NOT appear in the required-projection.
+
+ 4. If a projection entry includes an action that the reader cannot
evaluate,
+ the reader MUST fail rather than ignore the transform.
+
+ 5. An identity transform is equivalent to projecting the column
directly.
+
+ 8. The data type of the projected column MUST match the data type
defined for the transform result.
+
+ type: array
+ items:
+ $ref: '#/components/schemas/Projection'
+ required-row-filter:
+ description: >
+ An expression that filters rows in the table.
+
+ 1. A reader MUST discard any row for which the filter evaluates to
false or null, and
+ no information derived from discarded rows MAY be included in
the query result.
+
+ 2. Row filters MUST be evaluated against the original,
untransformed column values.
+ Required projections MUST be applied only after row filters are
applied.
+
+ 3. If the catalog supports multiple row access filters for the
table, it is
+ the catalog's responsibility to combine them using the
appropriate logic (e.g., AND, OR).
+
+ 4. If a client cannot interpret or evaluate a provided filter
expression, it MUST fail.
+
+ 5. If the required-row-filter property is absent or empty, no
mandatory filtering is imposed.
+ $ref: '#/components/schemas/Expression'
+
+ Projection:
+ type: object
+ description: Defines a projection for a column.
+ properties:
+ source-id:
+ type: int
+ description: field id of the column being projected.
+ action:
+ $ref: '#/components/schemas/Action'
+ required:
+ - source-id
+ - action
+
+ Action:
+ description: Defines the specific action to be executed for computing
the projection.
+ oneOf:
+ - $ref: '#/components/schemas/MaskHashSha256'
+ - $ref: '#/components/schemas/MaskReplaceWithNull'
+ - $ref: '#/components/schemas/MaskAlphanumeric'
Review Comment:
> There's definition of what are valid transform function names in this
> specification, so is there any reason we could not add those functions there
> instead

This is something we debated a lot too: introducing new transform functions
versus just introducing an Action. Defining transforms for these seemed like
overkill at that point. Just putting the actions in the spec and letting
engines use their own implementations, especially for a case like sha256,
seemed pretty reasonable.

> So is the function defined for both string and binary (and for binary, the
> binary representation of the hash is returned then?). Does the function returns
> null if the input is null?

Precisely: the return type should be the same after the transformation, so a
binary input yields a binary result, and a null input yields null. Please let
me know if we want to be explicit about this; happy to pen it down in the spec.
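
To make the intended semantics concrete, here is a minimal Python sketch of a
type-preserving SHA-256 mask. It is only an illustration, not part of the PR:
the function name `mask_hash_sha256`, the UTF-8 encoding of string input, and
the lowercase hex rendering of the string result are all assumptions that the
spec text would still need to pin down.

```python
import hashlib
from typing import Optional, Union

Maskable = Union[str, bytes]


def mask_hash_sha256(value: Optional[Maskable]) -> Optional[Maskable]:
    """Hypothetical mask: hash the value, keep its type, pass null through."""
    if value is None:
        # Null in, null out.
        return None
    if isinstance(value, bytes):
        # Binary in, binary out: the raw 32-byte digest.
        return hashlib.sha256(value).digest()
    if isinstance(value, str):
        # String in, string out. UTF-8 input encoding and hex output
        # are assumptions made for this sketch only.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    # Mirror the "fail rather than ignore" rule for unsupported inputs.
    raise TypeError(f"unsupported type for sha256 mask: {type(value).__name__}")
```

Under these assumptions a binary column stays binary (32 bytes), a string
column stays a string (64 hex characters), and nulls are preserved.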
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]