twuebi commented on PR #638:
URL: https://github.com/apache/iceberg-go/pull/638#issuecomment-3597385204
Yes, @zeroshade, I must have misunderstood iceberg-java's updateSnapshot
logic, and it seems that iceberg-rust may have a bug there, this is java
source, there's no clearing snapshot log on removeRef, only upon
`MetadataUpdate.RemoveSnapshot`, dangling references are purged from the
snapshotlog
```java
public Builder removeRef(String name) {
if (SnapshotRef.MAIN_BRANCH.equals(name)) {
this.currentSnapshotId = -1;
}
SnapshotRef ref = refs.remove(name);
if (ref != null) {
changes.add(new MetadataUpdate.RemoveSnapshotRef(name));
}
return this;
}
```
```java
private static List<HistoryEntry> updateSnapshotLog(
List<HistoryEntry> snapshotLog,
Map<Long, Snapshot> snapshotsById,
long currentSnapshotId,
List<MetadataUpdate> changes) {
Set<Long> intermediateSnapshotIds = intermediateSnapshotIdSet(changes,
currentSnapshotId);
boolean hasIntermediateSnapshots = !intermediateSnapshotIds.isEmpty();
boolean hasRemovedSnapshots =
changes.stream()
.anyMatch(
change ->
change instanceof MetadataUpdate.RemoveSnapshots
|| change instanceof
MetadataUpdate.RemoveSnapshot);
if (!hasIntermediateSnapshots && !hasRemovedSnapshots) {
return snapshotLog;
}
// update the snapshot log
List<HistoryEntry> newSnapshotLog = Lists.newArrayList();
for (HistoryEntry logEntry : snapshotLog) {
long snapshotId = logEntry.snapshotId();
if (snapshotsById.containsKey(snapshotId)) {
if (!intermediateSnapshotIds.contains(snapshotId)) {
// copy the log entries that are still valid
newSnapshotLog.add(logEntry);
}
} else if (hasRemovedSnapshots) {
// any invalid entry causes the history before it to be removed.
otherwise, there could be
// history gaps that cause time-travel queries to produce
incorrect results. for example,
// if history is [(t1, s1), (t2, s2), (t3, s3)] and s2 is removed,
the history cannot be
// [(t1, s1), (t3, s3)] because it appears that s3 was current
during the time between t2
// and t3 when in fact s2 was the current snapshot.
newSnapshotLog.clear();
}
}
if (snapshotsById.get(currentSnapshotId) != null) {
ValidationException.check(
Iterables.getLast(newSnapshotLog).snapshotId() ==
currentSnapshotId,
"Cannot set invalid snapshot log: latest entry is not the
current snapshot");
}
return newSnapshotLog;
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]