dttung2905 commented on code in PR #735:
URL: https://github.com/apache/iceberg-go/pull/735#discussion_r2885569768
##########
table/arrow_scanner.go:
##########
@@ -390,6 +390,86 @@ func (as *arrowScan) getRecordFilter(ctx context.Context, fileSchema *iceberg.Sc
return nil, false, nil
}
+// synthesizeRowLineageColumns fills _row_id and _last_updated_sequence_number from task constants
+// when those columns are present in the batch (e.g. from ToRequestedSchema). Per the Iceberg v3
+// row lineage spec: if the value is null in the file, it is inherited (synthesized) from the file's
+// first_row_id and data_sequence_number; otherwise the value from the file is kept.
+// rowOffset is the 0-based row index within the current file and is updated so _row_id stays
+// correct across multiple batches from the same file (first_row_id + row_position).
+func synthesizeRowLineageColumns(
Review Comment:
Thanks for pointing this out. I see that PR 762 has been approved and is
waiting to be merged. Let me know once it lands in `main` so that I can rebase
and apply the fix to this PR.
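
For reference, the inheritance rule described in the doc comment above can be sketched roughly as follows. This is only an illustration on plain Go slices, not the code in this PR: `inheritRowLineage` and `ptr` are hypothetical names, nil pointers stand in for null values, and the real function operates on Arrow record batches instead.

```go
package main

import "fmt"

// ptr is a hypothetical helper that returns a pointer to v.
func ptr(v int64) *int64 { return &v }

// inheritRowLineage is a hypothetical sketch of the rule in the doc comment:
// a nil (null) _row_id becomes firstRowID + row position within the file, and
// a nil _last_updated_sequence_number becomes dataSeqNum. Non-nil values read
// from the file are kept. rowOffset is the 0-based position of the first row
// of this batch within the file; the offset for the next batch is returned.
func inheritRowLineage(rowIDs, seqNums []*int64, firstRowID, dataSeqNum, rowOffset int64) int64 {
	for i := range rowIDs {
		if rowIDs[i] == nil {
			rowIDs[i] = ptr(firstRowID + rowOffset + int64(i))
		}
	}
	for i := range seqNums {
		if seqNums[i] == nil {
			seqNums[i] = ptr(dataSeqNum)
		}
	}
	// Advance by the batch length so later batches continue the numbering.
	return rowOffset + int64(len(rowIDs))
}

func main() {
	// Two batches from the same file; assumed first_row_id = 100 and
	// data_sequence_number = 9 (illustrative values only).
	batch1IDs := []*int64{nil, ptr(7), nil}
	batch1Seq := []*int64{nil, ptr(5), nil}
	batch2IDs := []*int64{nil, nil}
	batch2Seq := []*int64{nil, nil}

	offset := inheritRowLineage(batch1IDs, batch1Seq, 100, 9, 0)
	offset = inheritRowLineage(batch2IDs, batch2Seq, 100, 9, offset)

	for _, id := range append(batch1IDs, batch2IDs...) {
		fmt.Println(*id)
	}
	fmt.Println("next offset:", offset)
}
```

Returning the updated offset, rather than mutating shared state, is one way to keep `_row_id` consistent across batches; the PR may track `rowOffset` differently.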
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]