stevenzwu commented on code in PR #16689:
URL: https://github.com/apache/iceberg/pull/16689#discussion_r3364254819
##########
core/src/main/java/org/apache/iceberg/Tracking.java:
##########
@@ -28,13 +28,13 @@ interface Tracking {
0,
"status",
Types.IntegerType.get(),
- "Entry status: 0=existing, 1=added, 2=deleted, 3=replaced");
+ "Entry status: 0=existing, 1=added, 2=deleted, 3=replaced,
4=modified");
Types.NestedField SNAPSHOT_ID =
Types.NestedField.optional(
1,
"snapshot_id",
Types.LongType.get(),
- "Snapshot ID where the file was added or deleted");
+ "Snapshot ID where the file was added, deleted, replaced, or
modified");
Review Comment:
`TrackingBuilder.terminal()` writes `newSnapshotId` for both DELETED and
REPLACED today, so the doc matches the current code.
To me, `deleted` and `replaced` entries should behave the same. replaced is
just a special flavor of deleted. Since v1-v3 already modify the snapshot ID
for a deleted entry, it seems reasonable to maintain the same behavior.
Also for change detection, it is probably better to update the snapshot id
in tracking for deleted or replaced entries. If the tracking snapshot_id
doesn't match the current snapshot id, the entries should be ignored for change
detection. This is important because the current spec is silent on lifetime of
deleted entries.
##########
core/src/main/java/org/apache/iceberg/EntryStatus.java:
##########
@@ -23,8 +23,13 @@ enum EntryStatus {
EXISTING(0),
ADDED(1),
DELETED(2),
- /** Indicates an entry that has been replaced by a column update or DV
change. Added in v4. */
- REPLACED(3);
+ /**
+ * Non-live entry recording that a prior file version was superseded by
another live entry. Added
+ * in v4.
+ */
+ REPLACED(3),
+ /** Live entry recording that the file was modified in this snapshot. Added
in v4. */
Review Comment:
Mirror the framing rdblue suggested for REPLACED above — MODIFIED is its
live counterpart.
```suggestion
/** The new (live) state of an entry that has been modified. Paired with
REPLACED. Added in v4. */
```
##########
core/src/main/java/org/apache/iceberg/EntryStatus.java:
##########
@@ -23,8 +23,13 @@ enum EntryStatus {
EXISTING(0),
ADDED(1),
DELETED(2),
- /** Indicates an entry that has been replaced by a column update or DV
change. Added in v4. */
- REPLACED(3);
+ /**
+ * Non-live entry recording that a prior file version was superseded by
another live entry. Added
+ * in v4.
Review Comment:
Is `old` better than `starting` in this conext?
> The old (replaced) state of en entry that is modified. Paired with
MODIFIED. Added in v4.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]