stevenzwu commented on code in PR #13879:
URL: https://github.com/apache/iceberg/pull/13879#discussion_r3211673037


##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data

Review Comment:
   Normative actor terminology drifts across this section: "client" here on 
line 3488 (and again in rule 3 of `required-row-filter`), "reader" on lines 
3520/3523/3528/3548, and "engine" on line 3613 (`MaskToFixedValue`) and line 
3754 (`Sha256QueryLocal`). A normative spec should pick one term and use it 
everywhere. "Reader" is probably the cleanest, since these requirements attach 
to read-side behavior, and a single term makes the MUST/SHOULD requirements 
easier for implementers to track.



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.

Review Comment:
   This global rule conflicts with two of the actions defined below:
   
   - `mask-to-fixed-value` (line 3610): the action's purpose is to replace 
values with a constant, but under this rule NULL passes through unmasked. That 
contradicts the action's intent and reveals which rows hold NULL, which can 
itself be sensitive information.
   - `apply-expression` (line 3768): an arbitrary expression like 
`coalesce(col, 'unknown')` will produce non-NULL output from NULL input; the 
rule above forbids that.
   
   Suggest dropping the global rule and stating NULL-input behavior per-action 
instead. `replace-with-null` and the SHA-256 / truncate / mask-alphanum / 
show-first-4 / show-last-4 actions can keep the "NULL in ⇒ NULL out" guarantee 
in their individual descriptions; `mask-to-fixed-value` and `apply-expression` 
need to define their own behavior.
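   
   To make the conflict concrete, here is a minimal Python sketch. It assumes 
NULL is modelled as `None` and uses hypothetical helper names that are not part 
of the spec. `mask_fixed_always` is only one possible per-action definition, 
not a proposal for the final wording.

```python
from typing import Optional

FIXED_STRING = "XXXXXXXX"  # fixed value for string columns, per MaskToFixedValue


def mask_fixed_null_passthrough(value: Optional[str]) -> Optional[str]:
    """mask-to-fixed-value as the global rule forces it: NULL in, NULL out."""
    return None if value is None else FIXED_STRING


def mask_fixed_always(value: Optional[str]) -> str:
    """Assumed alternative: mask NULL as well, hiding which rows were NULL."""
    return FIXED_STRING


def coalesce_unknown(value: Optional[str]) -> str:
    """Stand-in for apply-expression with coalesce(col, 'unknown')."""
    return "unknown" if value is None else value


rows = ["alice", None, "bob"]
print([mask_fixed_null_passthrough(v) for v in rows])  # NULL positions stay visible
print([mask_fixed_always(v) for v in rows])            # NULL positions are hidden
print([coalesce_unknown(v) for v in rows])             # non-NULL output from NULL input
```

   The first function is what the global rule requires today; the third shows 
an expression the global rule would forbid outright.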



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.
+
+          If a column projection targets a struct-typed field, other column 
projections
+          in the same ReadRestrictions MUST NOT target any of that struct's 
subfields
+          (at any depth). This avoids ambiguity about which action governs a 
given
+          leaf value.
+      properties:
+        required-column-projections:
+          description: >
+            A list of columns that require specific actions to be applied when 
reading.
+
+            If this property is absent, a reader MAY access all columns of the 
table as-is
+            without any mandatory transformations.
+
+            If this property is present, each listed column MUST have its 
specified
+            action applied. Columns not listed in required-column-projections
+            are not subject to any read restrictions.
+
+            When this list is present:
+
+            1. For each column listed in required-column-projections, the 
reader MUST apply
+              the specified action before returning values for that column.
+
+            2. The reader MUST replace all output references to the column 
with the result
+              of the action, presenting the result under the original column 
name. For
+              example, if the action for column cc is mask-alphanum, the 
reader MUST
+              return the masked value as cc in the query output.
+
+            3. Columns not listed in required-column-projections MAY be 
projected normally
+              by the reader without any mandatory transformations.
+
+            4. A column MUST appear at most once in 
required-column-projections.
+
+            5. If a projected column's action cannot be evaluated by the reader
+              (including unrecognized action types), the reader MUST fail 
rather than
+              ignore or skip the action.
+
+            6. Each action defines the output type for its column. For all 
predefined
+              actions except apply-expression, the output type matches the 
input column
+              type. For apply-expression, the output type is determined by the 
expression.

Review Comment:
   A worked JSON example for `ReadRestrictions` in the spec body would help 
implementers a lot: this is a non-trivial structure with a discriminated union, 
and there is no example here. The one posted as a PR comment (`{ 
"required-column-projections": [ { "field-id": 4, "action": "show-last-4" }, 
... ], "required-row-filter": ... }`) reads well; consider lifting it into the 
description.
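   
   As a rough illustration of how a consumer might check such a payload, here 
is a Python sketch. The second projection entry and both field ids are made up 
for the example, the row filter is omitted, and `check_projections` is a 
hypothetical helper, not an API from any Iceberg library.

```python
import json

# Hypothetical response fragment; only the first entry comes from the PR comment.
payload = json.loads("""
{
  "required-column-projections": [
    {"field-id": 4, "action": "show-last-4"},
    {"field-id": 7, "action": "mask-alphanum"}
  ]
}
""")

# Action names taken from the discriminator mapping in this diff.
KNOWN_ACTIONS = {
    "mask-alphanum", "mask-to-fixed-value", "replace-with-null",
    "show-first-4", "show-last-4", "truncate-to-year", "truncate-to-month",
    "sha-256-global", "sha-256-query-local", "apply-expression",
}


def check_projections(restrictions: dict) -> dict:
    """Return {field-id: action}, failing on unknown actions or duplicates."""
    by_field = {}
    for entry in restrictions.get("required-column-projections", []):
        field_id, action = entry["field-id"], entry["action"]
        if action not in KNOWN_ACTIONS:
            # Rule 5: fail rather than skip an unrecognized action.
            raise ValueError(f"unrecognized action: {action}")
        if field_id in by_field:
            # Rule 4: a column appears at most once.
            raise ValueError(f"field {field_id} listed more than once")
        by_field[field_id] = action
    return by_field


print(check_projections(payload))
```

   An example in the description would let implementers write this kind of 
check against a known-good payload.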



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.
+
+          If a column projection targets a struct-typed field, other column 
projections
+          in the same ReadRestrictions MUST NOT target any of that struct's 
subfields
+          (at any depth). This avoids ambiguity about which action governs a 
given
+          leaf value.
+      properties:
+        required-column-projections:
+          description: >
+            A list of columns that require specific actions to be applied when 
reading.
+
+            If this property is absent, a reader MAY access all columns of the 
table as-is
+            without any mandatory transformations.
+
+            If this property is present, each listed column MUST have its 
specified
+            action applied. Columns not listed in required-column-projections
+            are not subject to any read restrictions.
+
+            When this list is present:
+
+            1. For each column listed in required-column-projections, the 
reader MUST apply
+              the specified action before returning values for that column.
+
+            2. The reader MUST replace all output references to the column 
with the result
+              of the action, presenting the result under the original column 
name. For
+              example, if the action for column cc is mask-alphanum, the 
reader MUST
+              return the masked value as cc in the query output.
+
+            3. Columns not listed in required-column-projections MAY be 
projected normally
+              by the reader without any mandatory transformations.
+
+            4. A column MUST appear at most once in 
required-column-projections.
+
+            5. If a projected column's action cannot be evaluated by the reader
+              (including unrecognized action types), the reader MUST fail 
rather than
+              ignore or skip the action.
+
+            6. Each action defines the output type for its column. For all 
predefined
+              actions except apply-expression, the output type matches the 
input column
+              type. For apply-expression, the output type is determined by the 
expression.
+
+          type: array
+          items:
+            $ref: '#/components/schemas/Action'
+        required-row-filter:
+          description: >
+            An expression that filters rows in the table that the 
authenticated principal does not have access to.
+
+            1. The expression MUST evaluate to a boolean. A reader MUST 
discard any row for which
+              the filter evaluates to FALSE, and no information derived from 
discarded rows
+              MAY be included in the query result.
+
+            2. Row filters MUST be evaluated against the original, 
untransformed column values.
+              Required projections MUST be applied only after row filters are 
applied.
+
+            3. If a client cannot interpret or evaluate a provided filter 
expression, it MUST fail.
+
+            4. If this property is absent, null, or always true then no 
mandatory filtering is required.
+          $ref: '#/components/schemas/Expression'
+
+    Action:
+      discriminator:
+        propertyName: action
+        mapping:
+          mask-alphanum: '#/components/schemas/MaskAlphanum'
+          mask-to-fixed-value: '#/components/schemas/MaskToFixedValue'
+          replace-with-null: '#/components/schemas/ReplaceWithNull'
+          show-first-4: '#/components/schemas/ShowFirst4'
+          show-last-4: '#/components/schemas/ShowLast4'
+          truncate-to-year: '#/components/schemas/TruncateToYear'
+          truncate-to-month: '#/components/schemas/TruncateToMonth'
+          sha-256-global: '#/components/schemas/Sha256Global'
+          sha-256-query-local: '#/components/schemas/Sha256QueryLocal'
+          apply-expression: '#/components/schemas/ApplyExpression'
+      type: object
+      required:
+        - action
+        - field-id
+      properties:
+        action:
+          type: string
+        field-id:
+          type: integer
+          description: field id of the column being projected.
+
+    MaskAlphanum:
+      description: >
+        Redacts the column value Unicode code point by code point using the 
following rules:
+
+        - Digits (U+0030–U+0039, 0-9) are replaced with 'n'
+        - The following punctuation characters are kept as-is:
+            U+0028 '('  LEFT PARENTHESIS
+            U+0029 ')'  RIGHT PARENTHESIS
+            U+002C ','  COMMA
+            U+002E '.'  FULL STOP
+            U+002D '-'  HYPHEN-MINUS
+            U+0040 '@'  COMMERCIAL AT
+        - All other Unicode characters (including letters, whitespace, and any 
punctuation
+          not listed above) are replaced with 'x'
+
+        For example: "[email protected]" → "[email protected]"
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "mask-alphanum"
+
+    MaskToFixedValue:
+      description: >
+        Replaces the column value with a predefined type-specific fixed value.
+        Engines MUST use exactly the values listed below to ensure consistency
+        across implementations.
+
+        Fixed values by type:
+        - boolean: false
+        - int: 0
+        - long: 0
+        - float: 0.0
+        - double: 0.0
+        - decimal(p, s): 0 (zero with s digits after the decimal point, e.g. 
0.00 for decimal(p,2))
+        - string: "XXXXXXXX"
+        - date: 1970-01-01
+        - time: 00:00:00
+        - timestamp: 1970-01-01T00:00:00
+        - timestamptz: 1970-01-01T00:00:00+00:00
+        - timestamp_ns: 1970-01-01T00:00:00.000000000
+        - timestamptz_ns: 1970-01-01T00:00:00.000000000+00:00
+        - uuid: 00000000-0000-0000-0000-000000000000
+        - fixed(n): n zero bytes
+        - binary: empty byte sequence
+        - variant: {}
+        - geometry: POINT EMPTY
+        - geography: POINT EMPTY
+        - list: empty list []
+        - map: empty map {}
+        - struct: struct with each field set to its type-specific default 
(applied recursively)
+
+        Applicable to: all data types
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "mask-to-fixed-value"
+
+    ReplaceWithNull:
+      description: >
+        Replaces the entire column value with NULL.
+
+        Applicable to: all nullable types
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "replace-with-null"
+
+    ShowFirst4:
+      description: >
+        Preserves the first 4 Unicode code points of the column value and 
redacts the remainder
+        using mask-alphanum rules (see MaskAlphanum for the exact character 
rules).
+        Values with 4 or fewer Unicode code points are returned unchanged.
+
+        For example: "[email protected]" → "[email protected]"
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "show-first-4"
+
+    ShowLast4:
+      description: >
+        Redacts all Unicode code points except the last 4 using mask-alphanum 
rules
+        (see MaskAlphanum for the exact character rules).
+        Values with 4 or fewer Unicode code points are returned unchanged.
+
+        For example: "4111-1111-1111-4444" → "nnnn-nnnn-nnnn-4444"
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "show-last-4"
+
+    TruncateToYear:
+      description: >
+        Truncates the column value to year precision, setting month, day, and 
time components
+        to their minimum values. The output type matches the input type.
+
+        For example: 2024-07-15 → 2024-01-01
+        For timestamptz and timestamptz_ns, truncation is performed in UTC.
+
+        Applicable to: date, timestamp, timestamptz, timestamp_ns, 
timestamptz_ns
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "truncate-to-year"
+
+    TruncateToMonth:
+      description: >
+        Truncates the column value to year and month precision, setting day 
and time components
+        to their minimum values. The output type matches the input type.
+
+        For example: 2024-07-15 → 2024-07-01
+        For timestamptz and timestamptz_ns, truncation is performed in UTC.
+
+        Applicable to: date, timestamp, timestamptz, timestamp_ns, 
timestamptz_ns
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "truncate-to-month"
+
+    Sha256Global:
+      description: |
+        Applies SHA-256 as specified in NIST FIPS 180-4. Deterministic across 
all queries
+        and engines — the same input always produces the same output.
+
+        Input-to-bytes encoding by type:
+        - string: UTF-8 encoded bytes
+        - int: 4 bytes, little-endian
+        - long: 8 bytes, little-endian
+        - binary: raw bytes as-is
+
+        Output encoding by type:
+        - string: 64-character lowercase hexadecimal string
+        - int: first 4 bytes of the digest, read as a signed two's complement 
little-endian int
+        - long: first 8 bytes of the digest, read as a signed two's complement 
little-endian long
+        - binary: the full 32-byte raw SHA-256 digest
+
+        Applicable to: string, int, long, binary
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "sha-256-global"
+
+    Sha256QueryLocal:
+      description: |
+        Applies SHA-256 with a per-query random salt, making the output 
non-deterministic
+        across queries while remaining consistent within a single query.
+
+        The engine MUST generate a cryptographically random salt of at least 
16 bytes for each query and apply it as:
+          SHA-256(salt_bytes || canonical_bytes)
+        where canonical_bytes follows the same encoding rules as 
sha-256-global.
+
+        Output encoding follows the same rules as sha-256-global.
+
+        Applicable to: string, int, long, binary
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "sha-256-query-local"
+
+    ApplyExpression:
+      description: >
+        Replace the field with the result of an expression. Produce the 
original field name
+        with the expression result.
+
+        Applicable to: all data types
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      required:
+        - action
+        - expression
+      properties:
+        action:
+          type: string
+          const: "apply-expression"
+        expression:
+          $ref: '#/components/schemas/Expression'

Review Comment:
   Implementability gap — `apply-expression` referencing other restricted 
columns.
   
   If `col_a`'s action is `apply-expression` over an expression that references 
`col_b`, and `col_b` itself has its own action in the same 
`required-column-projections`, the spec doesn't say whether the expression sees 
raw `col_b` or the transformed `col_b`. Two implementations could diverge here 
without violating any rule, which is exactly the kind of ambiguity that 
produces interop bugs.
   
   Concrete cases that need a defined answer:
   - `col_a` action: `apply-expression` of `length(col_b)`; `col_b` action: 
`mask-to-fixed-value`. If the expression sees raw `col_b`, `length(col_b)` 
returns the actual string length per row; if it sees masked `col_b`, it always 
returns `8` (the length of `"XXXXXXXX"`). Two engines reading the same response 
can produce different output.
   - `col_a` action: `apply-expression` of `col_b` (i.e., aliasing). With 
masking on `col_b`, `col_a` could leak the unmasked value if the expression 
sees raw input.
   
   Suggest one of:
   - Expressions in `apply-expression` MUST evaluate against raw column values 
(consistent with row-filter rule 2).
   - Expressions in `apply-expression` MUST NOT reference any column that has 
its own entry in `required-column-projections`.
   
   The first is more flexible; the second is simpler to validate.
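   
   The first concrete case can be reproduced with a small Python sketch. The 
row contents and function names are hypothetical; the only value taken from the 
spec is the `"XXXXXXXX"` fixed string for `mask-to-fixed-value`.

```python
FIXED_STRING = "XXXXXXXX"  # mask-to-fixed-value result for string columns

# Hypothetical row: col_a is governed by apply-expression length(col_b),
# col_b is governed by mask-to-fixed-value.
row = {"col_a": "ignored", "col_b": "secret-value"}


def project_expression_sees_raw(r: dict) -> dict:
    """Engine A: the expression is evaluated against the raw col_b."""
    return {"col_a": len(r["col_b"]), "col_b": FIXED_STRING}


def project_expression_sees_masked(r: dict) -> dict:
    """Engine B: col_b is masked first, then the expression is evaluated."""
    masked = dict(r, col_b=FIXED_STRING)
    return {"col_a": len(masked["col_b"]), "col_b": masked["col_b"]}


print(project_expression_sees_raw(row))
print(project_expression_sees_masked(row))
```

   Both engines return the same masked `col_b`, but `col_a` differs (12 versus 
8), and neither violates the current text.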



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.
+
+          If a column projection targets a struct-typed field, other column 
projections
+          in the same ReadRestrictions MUST NOT target any of that struct's 
subfields
+          (at any depth). This avoids ambiguity about which action governs a 
given
+          leaf value.
+      properties:
+        required-column-projections:
+          description: >
+            A list of columns that require specific actions to be applied when 
reading.
+
+            If this property is absent, a reader MAY access all columns of the 
table as-is
+            without any mandatory transformations.
+
+            If this property is present, each listed column MUST have its 
specified
+            action applied. Columns not listed in required-column-projections
+            are not subject to any read restrictions.
+
+            When this list is present:
+
+            1. For each column listed in required-column-projections, the 
reader MUST apply
+              the specified action before returning values for that column.
+
+            2. The reader MUST replace all output references to the column 
with the result
+              of the action, presenting the result under the original column 
name. For
+              example, if the action for column cc is mask-alphanum, the 
reader MUST
+              return the masked value as cc in the query output.
+
+            3. Columns not listed in required-column-projections MAY be 
projected normally
+              by the reader without any mandatory transformations.
+
+            4. A column MUST appear at most once in 
required-column-projections.
+
+            5. If a projected column's action cannot be evaluated by the reader
+              (including unrecognized action types), the reader MUST fail 
rather than
+              ignore or skip the action.
+
+            6. Each action defines the output type for its column. For all 
predefined
+              actions except apply-expression, the output type matches the 
input column
+              type. For apply-expression, the output type is determined by the 
expression.
+
+          type: array
+          items:
+            $ref: '#/components/schemas/Action'
+        required-row-filter:
+          description: >
+            An expression that filters rows in the table that the 
authenticated principal does not have access to.
+
+            1. The expression MUST evaluate to a boolean. A reader MUST 
discard any row for which
+              the filter evaluates to FALSE, and no information derived from 
discarded rows
+              MAY be included in the query result.

Review Comment:
   Three-valued logic still isn't resolved here: the rule says what to do when 
the filter evaluates to FALSE, but not when it evaluates to NULL.
   ```suggestion
               1. The expression MUST evaluate to a boolean. A reader MUST keep 
only rows for which
                 the filter evaluates to TRUE; rows that evaluate to FALSE or 
NULL MUST be discarded,
                 and information derived from discarded rows MUST NOT be 
included in the query result.
   ```
   
   This matches SQL `WHERE` semantics and removes ambiguity about NULL/UNKNOWN.
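   
   A short Python sketch of the difference, assuming NULL is modelled as 
`None`. The `region = 'EU'` filter and the helper names are hypothetical and 
only serve to show the two readings.

```python
from typing import Callable, List, Optional


def region_is_eu(row: dict) -> Optional[bool]:
    """Hypothetical filter region = 'EU'; a NULL region evaluates to NULL."""
    region = row.get("region")
    return None if region is None else region == "EU"


def keep_true_only(rows: List[dict], pred: Callable[[dict], Optional[bool]]) -> List[dict]:
    """Suggested wording: keep a row only when the filter is exactly TRUE."""
    return [r for r in rows if pred(r) is True]


def discard_false_only(rows: List[dict], pred: Callable[[dict], Optional[bool]]) -> List[dict]:
    """Literal reading of the current wording: drop rows only when FALSE."""
    return [r for r in rows if pred(r) is not False]


rows = [
    {"id": 1, "region": "EU"},
    {"id": 2, "region": "US"},
    {"id": 3, "region": None},
]
print([r["id"] for r in keep_true_only(rows, region_is_eu)])
print([r["id"] for r in discard_false_only(rows, region_is_eu)])
```

   Under the current text a reader could legitimately return row 3; under the 
suggested text it cannot.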



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.
+
+          If a column projection targets a struct-typed field, other column 
projections
+          in the same ReadRestrictions MUST NOT target any of that struct's 
subfields
+          (at any depth). This avoids ambiguity about which action governs a 
given
+          leaf value.
+      properties:
+        required-column-projections:
+          description: >
+            A list of columns that require specific actions to be applied when 
reading.
+
+            If this property is absent, a reader MAY access all columns of the 
table as-is
+            without any mandatory transformations.
+
+            If this property is present, each listed column MUST have its 
specified
+            action applied. Columns not listed in required-column-projections
+            are not subject to any read restrictions.
+
+            When this list is present:
+
+            1. For each column listed in required-column-projections, the 
reader MUST apply
+              the specified action before returning values for that column.
+
+            2. The reader MUST replace all output references to the column 
with the result
+              of the action, presenting the result under the original column 
name. For
+              example, if the action for column cc is mask-alphanum, the 
reader MUST
+              return the masked value as cc in the query output.
+
+            3. Columns not listed in required-column-projections MAY be 
projected normally
+              by the reader without any mandatory transformations.
+
+            4. A column MUST appear at most once in 
required-column-projections.
+
+            5. If a projected column's action cannot be evaluated by the reader
+              (including unrecognized action types), the reader MUST fail 
rather than
+              ignore or skip the action.
+
+            6. Each action defines the output type for its column. For all 
predefined
+              actions except apply-expression, the output type matches the 
input column
+              type. For apply-expression, the output type is determined by the 
expression.
+
+          type: array
+          items:
+            $ref: '#/components/schemas/Action'
+        required-row-filter:
+          description: >
+            An expression that filters rows in the table that the 
authenticated principal does not have access to.
+
+            1. The expression MUST evaluate to a boolean. A reader MUST 
discard any row for which
+              the filter evaluates to FALSE, and no information derived from 
discarded rows
+              MAY be included in the query result.
+
+            2. Row filters MUST be evaluated against the original, 
untransformed column values.
+              Required projections MUST be applied only after row filters are 
applied.
+
+            3. If a client cannot interpret or evaluate a provided filter 
expression, it MUST fail.
+
+            4. If this property is absent, null, or always true then no 
mandatory filtering is required.
+          $ref: '#/components/schemas/Expression'
+
+    Action:
+      discriminator:
+        propertyName: action
+        mapping:
+          mask-alphanum: '#/components/schemas/MaskAlphanum'
+          mask-to-fixed-value: '#/components/schemas/MaskToFixedValue'
+          replace-with-null: '#/components/schemas/ReplaceWithNull'
+          show-first-4: '#/components/schemas/ShowFirst4'
+          show-last-4: '#/components/schemas/ShowLast4'
+          truncate-to-year: '#/components/schemas/TruncateToYear'
+          truncate-to-month: '#/components/schemas/TruncateToMonth'
+          sha-256-global: '#/components/schemas/Sha256Global'
+          sha-256-query-local: '#/components/schemas/Sha256QueryLocal'
+          apply-expression: '#/components/schemas/ApplyExpression'
+      type: object
+      required:
+        - action
+        - field-id
+      properties:
+        action:
+          type: string
+        field-id:
+          type: integer
+          description: field id of the column being projected.
+
+    MaskAlphanum:
+      description: >
+        Redacts the column value Unicode code point by code point using the 
following rules:
+
+        - Digits (U+0030–U+0039, 0-9) are replaced with 'n'
+        - The following punctuation characters are kept as-is:
+            U+0028 '('  LEFT PARENTHESIS
+            U+0029 ')'  RIGHT PARENTHESIS
+            U+002C ','  COMMA
+            U+002E '.'  FULL STOP
+            U+002D '-'  HYPHEN-MINUS
+            U+0040 '@'  COMMERCIAL AT
+        - All other Unicode characters (including letters, whitespace, and any 
punctuation
+          not listed above) are replaced with 'x'
+
+        For example: "[email protected]" → "[email protected]"
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "mask-alphanum"
+
+    MaskToFixedValue:
+      description: >
+        Replaces the column value with a predefined type-specific fixed value.
+        Engines MUST use exactly the values listed below to ensure consistency
+        across implementations.
+
+        Fixed values by type:
+        - boolean: false
+        - int: 0
+        - long: 0
+        - float: 0.0
+        - double: 0.0
+        - decimal(p, s): 0 (zero with s digits after the decimal point, e.g. 
0.00 for decimal(p,2))
+        - string: "XXXXXXXX"
+        - date: 1970-01-01
+        - time: 00:00:00
+        - timestamp: 1970-01-01T00:00:00
+        - timestamptz: 1970-01-01T00:00:00+00:00
+        - timestamp_ns: 1970-01-01T00:00:00.000000000
+        - timestamptz_ns: 1970-01-01T00:00:00.000000000+00:00
+        - uuid: 00000000-0000-0000-0000-000000000000
+        - fixed(n): n zero bytes
+        - binary: empty byte sequence
+        - variant: {}
+        - geometry: POINT EMPTY
+        - geography: POINT EMPTY
+        - list: empty list []
+        - map: empty map {}
+        - struct: struct with each field set to its type-specific default 
(applied recursively)
+
+        Applicable to: all data types
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "mask-to-fixed-value"
+
+    ReplaceWithNull:
+      description: >
+        Replaces the entire column value with NULL.
+
+        Applicable to: all nullable types
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "replace-with-null"
+
+    ShowFirst4:
+      description: >
+        Preserves the first 4 Unicode code points of the column value and 
redacts the remainder
+        using mask-alphanum rules (see MaskAlphanum for the exact character 
rules).
+        Values with 4 or fewer Unicode code points are returned unchanged.
+
+        For example: "[email protected]" → "[email protected]"
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "show-first-4"
+
+    ShowLast4:
+      description: >
+        Redacts all Unicode code points except the last 4 using mask-alphanum 
rules
+        (see MaskAlphanum for the exact character rules).
+        Values with 4 or fewer Unicode code points are returned unchanged.
+
+        For example: "4111-1111-1111-4444" → "nnnn-nnnn-nnnn-4444"
+
+        Applicable to: string
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "show-last-4"
+
+    TruncateToYear:
+      description: >
+        Truncates the column value to year precision, setting month, day, and 
time components
+        to their minimum values. The output type matches the input type.
+
+        For example: 2024-07-15 → 2024-01-01
+        For timestamptz and timestamptz_ns, truncation is performed in UTC.
+
+        Applicable to: date, timestamp, timestamptz, timestamp_ns, 
timestamptz_ns
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "truncate-to-year"
+
+    TruncateToMonth:
+      description: >
+        Truncates the column value to year and month precision, setting day 
and time components
+        to their minimum values. The output type matches the input type.
+
+        For example: 2024-07-15 → 2024-07-01
+        For timestamptz and timestamptz_ns, truncation is performed in UTC.
+
+        Applicable to: date, timestamp, timestamptz, timestamp_ns, 
timestamptz_ns
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "truncate-to-month"
+
+    Sha256Global:
+      description: |
+        Applies SHA-256 as specified in NIST FIPS 180-4. Deterministic across 
all queries
+        and engines — the same input always produces the same output.
+
+        Input-to-bytes encoding by type:
+        - string: UTF-8 encoded bytes
+        - int: 4 bytes, little-endian
+        - long: 8 bytes, little-endian
+        - binary: raw bytes as-is
+
+        Output encoding by type:
+        - string: 64-character lowercase hexadecimal string
+        - int: first 4 bytes of the digest, read as a signed two's complement 
little-endian int
+        - long: first 8 bytes of the digest, read as a signed two's complement 
little-endian long
+        - binary: the full 32-byte raw SHA-256 digest
+
+        Applicable to: string, int, long, binary
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "sha-256-global"
+
+    Sha256QueryLocal:
+      description: |
+        Applies SHA-256 with a per-query random salt, making the output 
non-deterministic
+        across queries while remaining consistent within a single query.
+
+        The engine MUST generate a cryptographically random salt of at least 
16 bytes for each query and apply it as:
+          SHA-256(salt_bytes || canonical_bytes)
+        where canonical_bytes follows the same encoding rules as 
sha-256-global.
+
+        Output encoding follows the same rules as sha-256-global.
+
+        Applicable to: string, int, long, binary
+      allOf:
+        - $ref: '#/components/schemas/Action'
+      properties:
+        action:
+          type: string
+          const: "sha-256-query-local"
+
+    ApplyExpression:

Review Comment:
   Should we defer adding this until Ryan's expression extension work is done? 
Right now, expressions can only be boolean.
   
   Compared to the other actions, this description is very thin for what is the 
most general-purpose mechanism. A few things worth specifying explicitly:
   
   1. **Output type derivation**: rule 6 of `required-column-projections` 
mentions this in passing ("for apply-expression, the output type is determined 
by the expression") but the constraint that the output type must be assignable 
to the original column's declared type belongs here.
   2. **Reference scope**: may the expression reference other columns of the 
same row? (Almost certainly yes, but the spec is silent.) May it reference 
unrelated tables, sessions, or constants only? Worth pinning down.
   3. **NULL-input behavior**: see the comment above on the global NULL rule — 
`apply-expression` cannot honor it in general, so this action's description 
should say what happens.
   4. **Determinism**: is the catalog allowed to return non-deterministic 
expressions? `Sha256QueryLocal` makes this explicit; this action does not.
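
To make the determinism contrast in point 4 concrete, here is a minimal sketch of the two hash actions as the quoted descriptions define them. The function names, the type-name strings, and the `salt` parameter are hypothetical and chosen only for illustration; this is not taken from any Iceberg implementation.

```python
import hashlib
import os
import struct


def canonical_bytes(value, iceberg_type):
    """Input-to-bytes encoding, following the sha-256-global description."""
    if iceberg_type == "string":
        return value.encode("utf-8")
    if iceberg_type == "int":
        return struct.pack("<i", value)   # 4 bytes, little-endian
    if iceberg_type == "long":
        return struct.pack("<q", value)   # 8 bytes, little-endian
    if iceberg_type == "binary":
        return bytes(value)
    raise ValueError(f"sha-256 is not applicable to type {iceberg_type!r}")


def encode_output(digest, iceberg_type):
    """Output encoding, following the sha-256-global description."""
    if iceberg_type == "string":
        return digest.hex()               # 64 lowercase hex characters
    if iceberg_type == "int":
        return struct.unpack("<i", digest[:4])[0]
    if iceberg_type == "long":
        return struct.unpack("<q", digest[:8])[0]
    return digest                         # binary: full 32-byte digest


def sha256_global(value, iceberg_type):
    """Deterministic: the same input gives the same output in every query."""
    if value is None:                     # global rule: NULL in, NULL out
        return None
    digest = hashlib.sha256(canonical_bytes(value, iceberg_type)).digest()
    return encode_output(digest, iceberg_type)


def new_query_salt():
    """One cryptographically random 16-byte salt, generated once per query."""
    return os.urandom(16)


def sha256_query_local(value, iceberg_type, salt):
    """Stable within one query (same salt), different across queries."""
    if value is None:
        return None
    if len(salt) < 16:
        raise ValueError("salt must be at least 16 bytes")
    data = salt + canonical_bytes(value, iceberg_type)
    return encode_output(hashlib.sha256(data).digest(), iceberg_type)
```

An `apply-expression` description could state its determinism expectations just as explicitly as the salt rule does here.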



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.
+
+          If a column projection targets a struct-typed field, other column 
projections
+          in the same ReadRestrictions MUST NOT target any of that struct's 
subfields
+          (at any depth). This avoids ambiguity about which action governs a 
given
+          leaf value.

Review Comment:
   This rule covers struct subfields but not list elements or map keys/values. 
A list of strings (`list<string>`) where the catalog wants to mask the 
elements, or a `map<string, int>` where keys vs. values may be sensitive, falls 
outside this. A short note clarifying intent — either "actions only target 
named struct fields, never list elements / map entries" or pointing to a 
follow-up — would close the gap.
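
A minimal sketch of the overlap check this rule implies, assuming a hypothetical flattened schema given as two dicts (`parent_of` and `type_of`, keyed by field ID). List elements and map keys/values carry their own field IDs in Iceberg, so the same walk surfaces the list/map case that the current wording leaves open.

```python
def find_overlaps(parent_of, type_of, targeted_ids):
    """Return (ancestor_id, descendant_id, ancestor_type) for every targeted
    field that is nested, at any depth, under another targeted field."""
    targeted = set(targeted_ids)
    overlaps = []
    for field_id in sorted(targeted):
        ancestor = parent_of.get(field_id)
        while ancestor is not None:
            if ancestor in targeted:
                overlaps.append((ancestor, field_id, type_of[ancestor]))
            ancestor = parent_of.get(ancestor)
    return overlaps


# Hypothetical schema:
#   1: address struct<2: city string>
#   3: tags    list<4: element string>
parent_of = {1: None, 2: 1, 3: None, 4: 3}
type_of = {1: "struct", 2: "string", 3: "list", 4: "string"}

# Covered by the rule as written: a struct and one of its subfields.
struct_conflicts = find_overlaps(parent_of, type_of, [1, 2])

# Not covered by the rule as written: a list and its element field.
list_conflicts = find_overlaps(parent_of, type_of, [3, 4])
```

Whether the second case should be rejected, allowed, or ruled out by restricting targets to named struct fields is exactly what the spec text would need to say.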



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no

Review Comment:
   `both properties` is unclear here until I read the later part of the schema. 
Maybe something like
   
   ```
   Empty ReadRestrictions object imposes no restrictions and is equivalent to 
...
   ```



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.
+
+          If a column projection targets a struct-typed field, other column 
projections
+          in the same ReadRestrictions MUST NOT target any of that struct's 
subfields
+          (at any depth). This avoids ambiguity about which action governs a 
given
+          leaf value.
+      properties:
+        required-column-projections:
+          description: >
+            A list of columns that require specific actions to be applied when 
reading.
+
+            If this property is absent, a reader MAY access all columns of the 
table as-is
+            without any mandatory transformations.
+
+            If this property is present, each listed column MUST have its 
specified
+            action applied. Columns not listed in required-column-projections
+            are not subject to any read restrictions.
+
+            When this list is present:
+
+            1. For each column listed in required-column-projections, the 
reader MUST apply
+              the specified action before returning values for that column.
+
+            2. The reader MUST replace all output references to the column 
with the result
+              of the action, presenting the result under the original column 
name. For
+              example, if the action for column cc is mask-alphanum, the 
reader MUST
+              return the masked value as cc in the query output.
+
+            3. Columns not listed in required-column-projections MAY be 
projected normally
+              by the reader without any mandatory transformations.
+
+            4. A column MUST appear at most once in 
required-column-projections.
+
+            5. If a projected column's action cannot be evaluated by the reader
+              (including unrecognized action types), the reader MUST fail 
rather than
+              ignore or skip the action.
+
+            6. Each action defines the output type for its column. For all 
predefined
+              actions except apply-expression, the output type matches the 
input column
+              type. For apply-expression, the output type is determined by the 
expression.
+
+          type: array
+          items:
+            $ref: '#/components/schemas/Action'
+        required-row-filter:
+          description: >
+            An expression that filters rows in the table that the 
authenticated principal does not have access to.
+
+            1. The expression MUST evaluate to a boolean. A reader MUST 
discard any row for which
+              the filter evaluates to FALSE, and no information derived from 
discarded rows
+              MAY be included in the query result.
+
+            2. Row filters MUST be evaluated against the original, 
untransformed column values.
+              Required projections MUST be applied only after row filters are 
applied.
+
+            3. If a client cannot interpret or evaluate a provided filter 
expression, it MUST fail.
+
+            4. If this property is absent, null, or always true then no 
mandatory filtering is required.
+          $ref: '#/components/schemas/Expression'
+
+    Action:
+      discriminator:
+        propertyName: action
+        mapping:
+          mask-alphanum: '#/components/schemas/MaskAlphanum'
+          mask-to-fixed-value: '#/components/schemas/MaskToFixedValue'
+          replace-with-null: '#/components/schemas/ReplaceWithNull'
+          show-first-4: '#/components/schemas/ShowFirst4'
+          show-last-4: '#/components/schemas/ShowLast4'
+          truncate-to-year: '#/components/schemas/TruncateToYear'
+          truncate-to-month: '#/components/schemas/TruncateToMonth'
+          sha-256-global: '#/components/schemas/Sha256Global'
+          sha-256-query-local: '#/components/schemas/Sha256QueryLocal'
+          apply-expression: '#/components/schemas/ApplyExpression'
+      type: object
+      required:
+        - action
+        - field-id
+      properties:
+        action:
+          type: string
+        field-id:
+          type: integer
+          description: field id of the column being projected.

Review Comment:
   Style nits: capitalize the first word and use the canonical "ID" rendering — 
Iceberg generally writes "ID" rather than "id" in descriptive text:
   
   ```suggestion
             description: Field ID of the column being projected.
   ```



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3480,6 +3480,309 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.
+
+          These restrictions apply only to the authenticated principal, user, 
or account
+          associated with the request. They MUST NOT be interpreted as global 
policy and
+          MUST NOT be applied beyond the entity identified by the 
Authentication header
+          (or other applicable authentication mechanism).
+
+          If both properties are absent or empty, the ReadRestrictions object 
imposes no
+          restrictions and is equivalent to the field being absent from the 
response.
+          A server MUST NOT return an action for a column whose type is not 
listed in
+          that action's "Applicable to" set.
+          For all actions, if the input column value is NULL, the output MUST 
be NULL.
+
+          If a column projection targets a struct-typed field, other column 
projections
+          in the same ReadRestrictions MUST NOT target any of that struct's 
subfields
+          (at any depth). This avoids ambiguity about which action governs a 
given
+          leaf value.
+      properties:
+        required-column-projections:
+          description: >
+            A list of columns that require specific actions to be applied when 
reading.
+
+            If this property is absent, a reader MAY access all columns of the 
table as-is
+            without any mandatory transformations.
+
+            If this property is present, each listed column MUST have its 
specified
+            action applied. Columns not listed in required-column-projections
+            are not subject to any read restrictions.
+
+            When this list is present:
+
+            1. For each column listed in required-column-projections, the 
reader MUST apply
+              the specified action before returning values for that column.
+
+            2. The reader MUST replace all output references to the column 
with the result
+              of the action, presenting the result under the original column 
name. For
+              example, if the action for column cc is mask-alphanum, the 
reader MUST
+              return the masked value as cc in the query output.
+
+            3. Columns not listed in required-column-projections MAY be 
projected normally
+              by the reader without any mandatory transformations.
+
+            4. A column MUST appear at most once in 
required-column-projections.
+
+            5. If a projected column's action cannot be evaluated by the reader
+              (including unrecognized action types), the reader MUST fail 
rather than
+              ignore or skip the action.
+
+            6. Each action defines the output type for its column. For all 
predefined
+              actions except apply-expression, the output type matches the 
input column
+              type. For apply-expression, the output type is determined by the 
expression.
+
+          type: array
+          items:
+            $ref: '#/components/schemas/Action'
+        required-row-filter:
+          description: >
+            An expression that filters rows in the table that the 
authenticated principal does not have access to.
+
+            1. The expression MUST evaluate to a boolean. A reader MUST 
discard any row for which
+              the filter evaluates to FALSE, and no information derived from 
discarded rows
+              MAY be included in the query result.
+
+            2. Row filters MUST be evaluated against the original, 
untransformed column values.
+              Required projections MUST be applied only after row filters are 
applied.
+
+            3. If a client cannot interpret or evaluate a provided filter 
expression, it MUST fail.

Review Comment:
   Implementability — clarify what "MUST fail" means.
   
   Both projection rule 5 (line 3535 above) and row-filter rule 3 here use 
"MUST fail" without defining the failure mode. For a security feature this 
needs to be fail-closed (refuse the query and return an error to the caller), 
never fail-open or fail-silent (return empty results, or skip the action and 
return raw values). Worth stating explicitly:
   
   ```suggestion
               3. If a client cannot interpret or evaluate a provided filter 
expression, it MUST fail
                 the query and return an error to the caller. The client MUST 
NOT return any rows
                 of the table when this happens.
   ```
   
   A matching clarification on projection rule 5 would close the analogous gap 
there.
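
A minimal sketch of the fail-closed behavior described above, assuming a hypothetical reader that resolves every action before it touches any row. `ReadRestrictionError`, `ACTIONS`, `plan_projections`, and `read_rows` are illustrative names, and only two of the predefined actions are wired in.

```python
class ReadRestrictionError(Exception):
    """A restriction cannot be enforced, so the query must not return data."""


KEPT_PUNCTUATION = set("(),.-@")


def mask_alphanum(value):
    """mask-alphanum rules as quoted: digits -> 'n', listed punctuation kept,
    every other code point -> 'x'. NULL stays NULL."""
    if value is None:
        return None
    out = []
    for ch in value:  # iterating a Python str yields Unicode code points
        if "0" <= ch <= "9":
            out.append("n")
        elif ch in KEPT_PUNCTUATION:
            out.append(ch)
        else:
            out.append("x")
    return "".join(out)


ACTIONS = {
    "mask-alphanum": mask_alphanum,
    "replace-with-null": lambda value: None,
}


def plan_projections(required_column_projections):
    """Resolve all actions up front; raise instead of skipping anything."""
    plan = {}
    for entry in required_column_projections:
        name = entry.get("action")
        func = ACTIONS.get(name)
        if func is None:
            raise ReadRestrictionError(f"cannot evaluate action {name!r}")
        field_id = entry["field-id"]
        if field_id in plan:
            raise ReadRestrictionError(f"field {field_id} listed more than once")
        plan[field_id] = func
    return plan


def read_rows(rows, required_column_projections):
    """Rows are dicts keyed by field ID. Planning runs first, so an
    unsupported action fails the whole read before any value is returned."""
    plan = plan_projections(required_column_projections)
    return [
        {fid: plan[fid](val) if fid in plan else val for fid, val in row.items()}
        for row in rows
    ]
```

Resolving the plan before reading is the design point: the error path never has a partially produced result to leak.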



##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -3265,6 +3265,133 @@ components:
           additionalProperties:
             type: string
 
+    ReadRestrictions:
+      type: object
+      description: >
+          Read restrictions for a table, including column projections and row 
filter expressions, according to the current schema.
+
+          A client MUST enforce the restrictions defined in this object when 
reading data
+          from the table.

Review Comment:
   +1 — the spec currently says `A client MUST enforce` without any 
qualification, which is unenforceable from the server side. Two paths that 
would tighten this:
   
   - State explicitly that this is a normative requirement on **trusted** 
clients, with a sentence noting that trust establishment (mTLS, on-behalf-of 
OAuth, etc.) is out of scope for this spec and is a catalog-implementation 
concern. The PR description already takes this position; lifting it into the 
spec text would make the assumption visible to readers.
   
   - Frame the field as advisory data accompanied by a separate normative 
requirement that catalogs MUST NOT include `ReadRestrictions` in responses to 
clients they do not trust to enforce them. This could also address the forward 
compatibility concern.
   
   An Iceberg client built before this spec change receives a `LoadTableResult` 
with an unknown `read-restrictions` field and silently ignores it (the standard 
"unknown fields are ignored" REST convention used elsewhere in this spec). On a 
security-critical feature, that's fail-open: the catalog has produced 
restrictions, the client returns the raw values, and the catalog has no way to 
detect this happened. This is a stronger problem — even a fully trusted client 
running an older version cannot enforce a contract it doesn't know about.
   
   Two options worth picking between in the spec:
   
   1. **Capability negotiation.** Define a request-side signal — a header 
(e.g., `Iceberg-Supported-Capabilities: read-restrictions`) or a query 
parameter — that the client uses to advertise support. Catalogs MUST NOT 
include `read-restrictions` in responses to clients that haven't advertised 
support, and instead MUST return 403 if the principal would have had 
restrictions. This puts enforcement on the side that can verify it.
   2. **Out-of-band trust.** State explicitly that catalogs MUST NOT return 
`read-restrictions` to clients whose enforcement they cannot verify 
out-of-band, and that establishing that verification is out of scope. Closer to 
the current PR-description framing, but makes the constraint normative.
   
   Without either, every catalog that adopts this is open to fail-open on stale 
clients.
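
A minimal sketch of option 1 on the catalog side, written as a pure function so it stays independent of any web framework. The header name is the example from this comment; the comma-separated value syntax, the function names, and the response shape are assumptions made only for illustration.

```python
CAPABILITY_HEADER = "iceberg-supported-capabilities"  # example name, lower-cased
READ_RESTRICTIONS = "read-restrictions"


def advertised_capabilities(headers):
    """Parse the (assumed) comma-separated capability list from the request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(CAPABILITY_HEADER, "")
    return {token.strip() for token in raw.split(",") if token.strip()}


def has_restrictions(restrictions):
    """An absent or empty object imposes no restrictions."""
    return bool(restrictions) and any(restrictions.values())


def load_table_response(headers, metadata, restrictions):
    """Return (status, body). Restrictions are only ever sent to clients that
    advertised support; everyone else is refused rather than served raw data."""
    if not has_restrictions(restrictions):
        return 200, {"metadata": metadata}
    if READ_RESTRICTIONS not in advertised_capabilities(headers):
        return 403, {"error": "client did not advertise read-restrictions support"}
    return 200, {"metadata": metadata, "read-restrictions": restrictions}
```

With this shape, a client built before the spec change never receives restricted table metadata at all, which removes the silent fail-open path.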



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
