[I] Support for projecting metadata columns _pos, _spec_id, and _partition in table scan [iceberg-rust]

via GitHub Mon, 08 Jun 2026 14:04:18 -0700


parthchandra opened a new issue, #2607:
URL: https://github.com/apache/iceberg-rust/issues/2607


   ### Is your feature request related to a problem or challenge?
   
   
   The metadata columns `_file`, `_pos`, `_spec_id`, and `_partition` are 
required for row-level operations (DELETE, UPDATE, MERGE) in both 
copy-on-write (CoW) and merge-on-read (MoR) strategies:
   
   | Column | CoW | MoR | Role |
   |--------|-----|-----|------|
   | `_file` | required | row identity | Identifies the source data file |
   | `_pos` | required | row identity | Identifies the row within the file |
   | `_spec_id` | -- | required | Partition spec ID for delta writes |
   | `_partition` | -- | required | Partition values for delta writes |
   
   Without these, query engines cannot implement row-level mutations against 
Iceberg tables via iceberg-rust.
   
   In iceberg-java, Spark's row-level operation API requires these columns:
   
   - **Copy-on-Write** 
(`SparkCopyOnWriteOperation.requiredMetadataAttributes()`): requests `_file` + 
`_pos` to identify which rows to rewrite during DELETE/UPDATE.
   - **Merge-on-Read** (`SparkPositionDeltaOperation`): uses `_file` + `_pos` 
as `rowId()` to uniquely identify rows, and requests `_spec_id` + `_partition` 
via `requiredMetadataAttributes()` so the delta writer knows which partition to 
write position delete files into.
   
   This pattern is consistent across iceberg-java's Spark modules, v3.4 
through v4.1. Any query engine (DataFusion Comet, etc.) building row-level 
operations on top of iceberg-rust will need these columns.
   
   
   
   ### Describe the solution you'd like
   
   ## 1. `_spec_id` metadata column
   
   **Description:** Constant per `FileScanTask` -- same pattern as `_file`.
   
   **Changes:**
   - Add `spec_id: i32` field to `FileScanTask` in `scan/task.rs`, populated 
from the manifest entry's `partition_spec_id` during scan planning
   - In `pipeline.rs`, inject as constant when projected:
     ```rust
     if task.project_field_ids().contains(&RESERVED_FIELD_ID_SPEC_ID) {
         let spec_id_datum = Datum::int(task.spec_id);
         builder = builder.with_constant(RESERVED_FIELD_ID_SPEC_ID, spec_id_datum);
     }
     ```
   - Add tests following the existing `_file` column test patterns
   
   ## 2. `_pos` metadata column
   
   **Description:** The ordinal row position (0-based) within the source data 
file. Unlike `_file` and `_spec_id`, this is NOT a constant -- it increases 
monotonically across batches within a file.
   
   **Changes:**
   - New `ColumnSource` variant in `RecordBatchTransformer` (e.g., 
`RowPosition`) that generates sequential `Int64Array` values
   - Mutable state in the transformer tracking the row offset across batches 
within a file. After each batch of N rows, `start_offset += N`.
   - Handle split reads: if a `FileScanTask` reads only a portion of a file, 
the initial position offset must account for the rows before the split (taken 
from the Parquet row group's row index offset)
   - In `pipeline.rs`, detect `RESERVED_FIELD_ID_POS` in projected fields and 
configure the transformer accordingly
   - Must use the same 0-based numbering semantics as positional delete files
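
   A minimal sketch of the offset tracking described above, using only the
   standard library. `RowPositionTracker` and `next_batch` are hypothetical
   names, and `Vec<i64>` stands in for the Arrow `Int64Array` the
   transformer would actually produce:

   ```rust
   /// Hypothetical helper: tracks the next row position within one data file.
   pub struct RowPositionTracker {
       next_pos: i64,
   }

   impl RowPositionTracker {
       /// `start_offset` is the number of rows that precede the first row
       /// this task reads (0 when reading from the start of the file).
       pub fn new(start_offset: i64) -> Self {
           Self { next_pos: start_offset }
       }

       /// Returns the 0-based positions for a batch of `num_rows` rows and
       /// advances the internal offset by `num_rows`.
       pub fn next_batch(&mut self, num_rows: usize) -> Vec<i64> {
           let start = self.next_pos;
           let end = start + num_rows as i64;
           self.next_pos = end;
           (start..end).collect()
       }
   }
   ```

   One tracker would be created per `FileScanTask` and dropped when the file
   is exhausted, so positions never carry over between files.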
   
   **Design considerations:**
   - iceberg-rust currently handles positional deletes via a separate 
`DeleteVector`/`RowSelection` pre-filtering mechanism. The `_pos` column is 
architecturally independent but must agree on numbering.
   - In Java, `PositionVectorReader` gets `rowStart` from 
`PageReadStore.getRowIndexOffset()` per row group, then fills `[rowStart, 
rowStart+1, ..., rowStart+N-1]` per batch.
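
   A rough illustration of the per-row-group numbering, assuming the reader
   knows the row count of every row group in the file. Both function names
   are hypothetical and nothing here is an existing iceberg-rust or parquet
   API. It also suggests that a single running offset may not be enough when
   a task skips row groups, since positions then jump between groups:

   ```rust
   /// Hypothetical helper: first-row index of each row group, computed as
   /// an exclusive prefix sum over the per-row-group row counts.
   pub fn row_group_starts(row_counts: &[i64]) -> Vec<i64> {
       let mut starts = Vec::with_capacity(row_counts.len());
       let mut acc = 0i64;
       for &n in row_counts {
           starts.push(acc);
           acc += n;
       }
       starts
   }

   /// Positions of all rows in the selected row groups, in the order the
   /// groups are listed. Panics if a selected index is out of range.
   pub fn positions_for_row_groups(
       row_counts: &[i64],
       selected: &[usize],
   ) -> Vec<i64> {
       let starts = row_group_starts(row_counts);
       let mut out = Vec::new();
       for &rg in selected {
           let start = starts[rg];
           out.extend(start..start + row_counts[rg]);
       }
       out
   }
   ```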
   
   ## 3. `_partition` metadata column
   
   **Description:** A struct column whose type is the union of all partition 
fields across all partition specs in the table (to handle partition evolution). 
Each row gets the partition values for its data file, with nulls for fields 
from other specs.
   
   **Changes:**
   
   1. **Compute the table-level partition type** as a union struct of all 
partition fields across all specs. Equivalent to Java's 
`Partitioning.partitionType(table)`. The function `partition_field()` in 
`metadata_columns.rs` already constructs a struct from partition fields -- may 
need a helper that collects fields across all specs.
   
   2. **Propagate the unified partition type to `FileScanTask`** so each task 
knows the full struct schema and which fields to null-fill for its specific 
spec.
   
   3. **Materialize as a constant `StructArray` per batch:**
      - Fields present in this file's partition spec get their values from 
partition data
      - Fields from other specs (partition evolution) get null
      - The struct is repeated (constant) for all rows in the batch
   
   4. **Handle type coercion:** Partition values may need coercion to match the 
canonical partition type (Java uses `StructProjection`).
   
   5. **Extend `RecordBatchTransformer`** to support struct-typed constants. 
Currently `create_primitive_array_repeated` only handles primitives, so an 
equivalent for constructing struct arrays is needed.
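
   Steps 1 and 3 can be sketched with simplified stand-in types. The
   `PartitionField` struct below is hypothetical and is not the iceberg-rust
   type, values are modelled as `Option<String>` rather than typed literals,
   and ordering the union by field ID is an assumption of this sketch that
   may differ from what Java's `Partitioning.partitionType(table)` produces:

   ```rust
   use std::collections::BTreeMap;

   /// Simplified stand-in for a partition field (hypothetical type).
   #[derive(Debug, Clone, PartialEq)]
   pub struct PartitionField {
       pub field_id: i32,
       pub name: String,
   }

   /// Union of partition fields across all specs, deduplicated by field ID
   /// and ordered by field ID (the first definition seen for an ID wins).
   pub fn unified_partition_fields(
       specs: &[Vec<PartitionField>],
   ) -> Vec<PartitionField> {
       let mut by_id: BTreeMap<i32, PartitionField> = BTreeMap::new();
       for spec in specs {
           for f in spec {
               by_id.entry(f.field_id).or_insert_with(|| f.clone());
           }
       }
       by_id.into_values().collect()
   }

   /// Lays out one file's partition values in the unified field order.
   /// Fields that the file's own spec does not contain become `None`.
   pub fn project_partition(
       unified: &[PartitionField],
       spec: &[PartitionField],
       values: &[Option<String>],
   ) -> Vec<Option<String>> {
       unified
           .iter()
           .map(|u| {
               spec.iter()
                   .position(|f| f.field_id == u.field_id)
                   .and_then(|i| values.get(i).cloned().flatten())
           })
           .collect()
   }
   ```

   The projected row would then be repeated for every row in the batch when
   building the constant struct array.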
   
   
   
   ### Willingness to contribute
   
   I can contribute to this feature independently


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

