hydrogenlee commented on PR #14080:
URL: https://github.com/apache/iceberg/pull/14080#issuecomment-3342759027
> Another question: any idea how to test this? It would be good to add a
> test, so we could prevent the issue recurring
I originally added a check (see below) to the `initializeState` function to
make sure there’s only one `GlobalStatistics` in state, but removed it before
submitting the PR because it might break backward compatibility. State written
by the old version can hold multiple `GlobalStatistics` entries in
`globalStatisticsState`, so after upgrading the Iceberg connector the check
could throw an exception and prevent the job from starting.
Adding a check like this would help prevent the issue, and we could cover it
with unit tests, but I’m not sure whether we can change it that way. Do you
have any suggestions?
```
if (context.isRestored()) {
  List<GlobalStatistics> globalStatisticsList = new ArrayList<>();
  IterableUtils.emptyIfNull(globalStatisticsState.get()).forEach(globalStatisticsList::add);
  if (CollectionUtils.isEmpty(globalStatisticsList)) {
    LOG.info(
        "Operator {} subtask {} doesn't have global statistics state to restore",
        operatorName,
        subtaskIndex);
    // If Flink deprecates union state in the future, RequestGlobalStatisticsEvent can be
    // leveraged to request global statistics from coordinator if new subtasks (scale-up case)
    // have nothing to restore from.
  } else {
    if (globalStatisticsList.size() > 1) {
      throw new IllegalStateException(
          "There should be at most one global stats written by the first subtask");
    }
    GlobalStatistics restoredStatistics = globalStatisticsList.get(0);
    LOG.info(
        "Operator {} subtask {} restored global statistics state", operatorName, subtaskIndex);
    this.globalStatistics = restoredStatistics;
  }
}
```
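
One way the check above might be made unit-testable without a Flink test harness is to pull the "at most one restored entry" rule into a small pure helper. The sketch below is only an illustration under that assumption: `RestoredStateCheck` and `singleOrEmpty` are hypothetical names that do not exist in the PR or in Iceberg, and the helper is generic so it does not depend on `GlobalStatistics`. It keeps the strict behavior (throwing on more than one entry), so the backward-compatibility concern described above still applies to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper (not part of the PR): isolates the rule that restored
// state may contain at most one entry, so it can be tested on plain collections.
final class RestoredStateCheck {
  private RestoredStateCheck() {}

  // Returns the single restored entry, or empty if there is nothing to restore.
  // Throws IllegalStateException if more than one entry was restored.
  static <T> Optional<T> singleOrEmpty(Iterable<T> restored) {
    List<T> entries = new ArrayList<>();
    if (restored != null) {
      restored.forEach(entries::add);
    }
    if (entries.isEmpty()) {
      return Optional.empty();
    }
    if (entries.size() > 1) {
      throw new IllegalStateException(
          "There should be at most one global stats written by the first subtask, found "
              + entries.size());
    }
    // Assumes restored entries are non-null; Optional.of rejects null.
    return Optional.of(entries.get(0));
  }
}
```

In `initializeState`, the restored iterable from `globalStatisticsState.get()` could then be passed to such a helper, and a unit test could exercise the empty, single-entry, and multi-entry cases directly.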
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]