parthchandra opened a new issue, #2607:
URL: https://github.com/apache/iceberg-rust/issues/2607
### Is your feature request related to a problem or challenge?
Row-level operations (DELETE, UPDATE, MERGE) depend on the following metadata
columns under the copy-on-write (CoW) and merge-on-read (MoR) strategies:
| Column | CoW | MoR | Role |
|--------|-----|-----|------|
| `_file` | required | row identity | Identifies the source data file |
| `_pos` | required | row identity | Identifies the row within the file |
| `_spec_id` | -- | required | Partition spec ID for delta writes |
| `_partition` | -- | required | Partition values for delta writes |
Without these, query engines cannot implement row-level mutations against
Iceberg tables via iceberg-rust.
In iceberg-java, Spark's row-level operation API requires these columns:
- **Copy-on-Write**
(`SparkCopyOnWriteOperation.requiredMetadataAttributes()`): requests `_file` +
`_pos` to identify which rows to rewrite during DELETE/UPDATE.
- **Merge-on-Read** (`SparkPositionDeltaOperation`): uses `_file` + `_pos`
as `rowId()` to uniquely identify rows, and requests `_spec_id` + `_partition`
via `requiredMetadataAttributes()` so the delta writer knows which partition to
write position delete files into.
This pattern is consistent across iceberg-java's Spark modules, v3.4 through
v4.1. Any query engine (DataFusion Comet, etc.) building row-level operations
on top of iceberg-rust will need these columns.
### Describe the solution you'd like
## 1. `_spec_id` metadata column
**Description:** Constant per `FileScanTask` -- same pattern as `_file`.
**Changes:**
- Add `spec_id: i32` field to `FileScanTask` in `scan/task.rs`, populated
from the manifest entry's `partition_spec_id` during scan planning
- In `pipeline.rs`, inject as constant when projected:
```rust
if task.project_field_ids().contains(&RESERVED_FIELD_ID_SPEC_ID) {
    let spec_id_datum = Datum::int(task.spec_id);
    builder = builder.with_constant(RESERVED_FIELD_ID_SPEC_ID, spec_id_datum);
}
```
- Add tests following the existing `_file` column test patterns
## 2. `_pos` metadata column
**Description:** The ordinal row position (0-based) within the source data
file. Unlike `_file` and `_spec_id`, this is NOT a constant -- it increases
monotonically across batches within a file.
**Changes:**
- New `ColumnSource` variant in `RecordBatchTransformer` (e.g.,
`RowPosition`) that generates sequential `Int64Array` values
- Mutable state in the transformer tracking the row offset across batches
within a file. After each batch of N rows, `start_offset += N`.
- Handle split reads: if a `FileScanTask` reads only a portion of a file, the
initial position offset must account for the rows before the split (taken from
the Parquet row group's row index offset)
- In `pipeline.rs`, detect `RESERVED_FIELD_ID_POS` in projected fields and
configure the transformer accordingly
- Must use the same 0-based numbering semantics as positional delete files
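
A minimal sketch of the offset tracking described above, kept free of Arrow types so it stands alone. `RowPositionTracker` and `next_batch` are hypothetical names, not existing iceberg-rust APIs. The sketch assumes each batch holds contiguous rows; if a `RowSelection` skips rows, the skipped ranges would also have to be added to the offset.

```rust
/// Hypothetical helper (not part of iceberg-rust): tracks the next row
/// position to emit for a single data file.
#[derive(Debug)]
pub struct RowPositionTracker {
    /// File-level, 0-based position of the next row to be emitted.
    next_pos: i64,
}

impl RowPositionTracker {
    /// `start_offset` is the file-level position of the first row this task
    /// reads: 0 for a whole-file read, or the row index offset of the first
    /// row group for a split read.
    pub fn new(start_offset: i64) -> Self {
        Self { next_pos: start_offset }
    }

    /// Returns the positions for a batch of `num_rows` rows and advances the
    /// internal offset by `num_rows`.
    pub fn next_batch(&mut self, num_rows: usize) -> Vec<i64> {
        let start = self.next_pos;
        let end = start + num_rows as i64;
        self.next_pos = end;
        (start..end).collect()
    }
}
```

In the transformer, the returned vector would presumably be wrapped in an `Int64Array` (for example via `Int64Array::from(vec)`) before being added to the output batch.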
**Design considerations:**
- iceberg-rust currently handles positional deletes via a separate
`DeleteVector`/`RowSelection` pre-filtering mechanism. The `_pos` column is
architecturally independent but must agree on numbering.
- In Java, `PositionVectorReader` gets `rowStart` from
`PageReadStore.getRowIndexOffset()` per row group, then fills `[rowStart,
rowStart+1, ..., rowStart+N-1]` per batch.
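
For reference, the per-row-group starting offset is a prefix sum over the row counts of the preceding row groups. The sketch below assumes those counts are available from the Parquet footer metadata; `row_group_start_offsets` is a hypothetical helper name, not an existing API.

```rust
/// Given the row count of each row group in file order, returns the
/// file-level, 0-based position of the first row of each row group.
pub fn row_group_start_offsets(row_counts: &[i64]) -> Vec<i64> {
    let mut offsets = Vec::with_capacity(row_counts.len());
    let mut running = 0i64;
    for &count in row_counts {
        // The first row of this group comes after all rows of earlier groups.
        offsets.push(running);
        running += count;
    }
    offsets
}
```

A task that starts reading at row group `k` would then seed its position counter with `offsets[k]` rather than 0, which keeps `_pos` aligned with positional delete files.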
## 3. `_partition` metadata column
**Description:** A struct column whose type is the union of all partition
fields across all partition specs in the table (to handle partition evolution).
Each row gets the partition values for its data file, with nulls for fields
from other specs.
**Changes:**
1. **Compute the table-level partition type** as a union struct of all
partition fields across all specs. Equivalent to Java's
`Partitioning.partitionType(table)`. The function `partition_field()` in
`metadata_columns.rs` already constructs a struct from partition fields -- may
need a helper that collects fields across all specs.
2. **Propagate the unified partition type to `FileScanTask`** so each task
knows the full struct schema and which fields to null-fill for its specific
spec.
3. **Materialize as a constant `StructArray` per batch:**
- Fields present in this file's partition spec get their values from
partition data
- Fields from other specs (partition evolution) get null
- The struct is repeated (constant) for all rows in the batch
4. **Handle type coercion:** Partition values may need coercion to match the
canonical partition type (Java uses `StructProjection`).
5. **Extend `RecordBatchTransformer`** to support struct-typed constants.
Currently `create_primitive_array_repeated` only handles primitives, so an
equivalent struct array constructor is needed.
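
To make steps 1 to 3 concrete, here is a rough sketch of the union and null-fill logic using simplified stand-in types. `PartitionField`, `PartitionSpec`, `unified_partition_fields`, and `project_partition` are all hypothetical and do not mirror the real iceberg-rust types. Partition values are modelled as `Option<String>` purely for illustration, and ordering the unified fields by field ID is an assumption of this sketch, not a statement about what Java's `Partitioning.partitionType(table)` does.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a partition field (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionField {
    pub field_id: i32,
    pub name: String,
}

/// Simplified stand-in for a partition spec (hypothetical type).
#[derive(Debug, Clone)]
pub struct PartitionSpec {
    pub spec_id: i32,
    pub fields: Vec<PartitionField>,
}

/// Union of partition fields across all specs, deduplicated by field ID and
/// ordered by field ID (the ordering is an assumption of this sketch).
pub fn unified_partition_fields(specs: &[PartitionSpec]) -> Vec<PartitionField> {
    let mut by_id: BTreeMap<i32, PartitionField> = BTreeMap::new();
    for spec in specs {
        for field in &spec.fields {
            by_id.entry(field.field_id).or_insert_with(|| field.clone());
        }
    }
    by_id.into_values().collect()
}

/// Projects one file's partition values (ordered as in `spec.fields`) onto
/// the unified layout. Fields absent from the file's spec become `None`.
pub fn project_partition(
    unified: &[PartitionField],
    spec: &PartitionSpec,
    values: &[Option<String>],
) -> Vec<Option<String>> {
    unified
        .iter()
        .map(|target| {
            spec.fields
                .iter()
                .position(|f| f.field_id == target.field_id)
                .and_then(|idx| values.get(idx).cloned().flatten())
        })
        .collect()
}
```

The projected row is computed once per `FileScanTask` and would then be repeated for every row of each batch when building the constant `StructArray`.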
### Willingness to contribute
I can contribute to this feature independently
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]