stevenzwu commented on code in PR #13827:
URL: https://github.com/apache/iceberg/pull/13827#discussion_r2279712988
##########
flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestDataStatisticsCoordinator.java:
##########
@@ -279,6 +279,57 @@ public void testRequestGlobalStatisticsEventHandling() throws Exception {
}
}
+ @Test
+ public void testMultipleRequestGlobalStatisticsEvents() throws Exception {
+ try (DataStatisticsCoordinator dataStatisticsCoordinator =
+ createCoordinator(StatisticsType.Map)) {
+ dataStatisticsCoordinator.start();
+ tasksReady(dataStatisticsCoordinator);
+
+ StatisticsEvent checkpoint1Subtask0DataStatisticEvent =
+ Fixtures.createStatisticsEvent(
+ StatisticsType.Map, Fixtures.TASK_STATISTICS_SERIALIZER, 1L, CHAR_KEYS.get("a"));
+ StatisticsEvent checkpoint1Subtask1DataStatisticEvent =
+ Fixtures.createStatisticsEvent(
+ StatisticsType.Map, Fixtures.TASK_STATISTICS_SERIALIZER, 1L, CHAR_KEYS.get("b"));
+
+ dataStatisticsCoordinator.handleEventFromOperator(
+ 0, 0, checkpoint1Subtask0DataStatisticEvent);
+ dataStatisticsCoordinator.handleEventFromOperator(
+ 1, 0, checkpoint1Subtask1DataStatisticEvent);
+
+ waitForCoordinatorToProcessActions(dataStatisticsCoordinator);
+
+ // signature is null
+ dataStatisticsCoordinator.handleEventFromOperator(0, 0, new RequestGlobalStatisticsEvent());
+
+ // Checkpoint StatisticEvent + RequestGlobalStatisticsEvent
+ Awaitility.await("wait for first statistics event")
+ .pollInterval(Duration.ofMillis(10))
+ .atMost(Duration.ofSeconds(10))
+ .until(() -> receivingTasks.getSentEventsForSubtask(0).size() == 2);
+
+ // signature is right
Review Comment:
nit: maybe the comment can be changed to the following?
```
Simulate the scenario where a subtask sends a global statistics request with
the same hash code. The coordinator skips the response after comparing the
hash code in the request with the hash code of the latest global statistics.
```
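For reference, the skip behavior described in the suggested comment can be sketched roughly as below. This is only an illustration under stated assumptions: `SignatureCheckSketch` and `respond` are hypothetical names, a plain `Map` stands in for the global statistics, and the real coordinator logic in the PR may be structured differently.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the coordinator's request handling; not the actual Iceberg code.
public class SignatureCheckSketch {

  /**
   * Returns the statistics to send back, or empty when the requester already
   * holds statistics whose hash code matches the latest global statistics.
   *
   * @param latestGlobalStatistics stand-in for the coordinator's latest global statistics
   * @param requestSignature hash code carried by the request, or null if the subtask has none
   */
  static Optional<Map<String, Long>> respond(
      Map<String, Long> latestGlobalStatistics, Integer requestSignature) {
    // A null signature means the subtask has no statistics yet, so always respond.
    if (requestSignature != null && requestSignature == latestGlobalStatistics.hashCode()) {
      // Same hash code: the subtask is assumed to be up to date, so skip the response.
      return Optional.empty();
    }
    return Optional.of(latestGlobalStatistics);
  }
}
```

With this shape, the test has three cases to cover: a null signature (respond), a matching signature (skip), and a stale signature (respond).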
##########
flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestDataStatisticsCoordinator.java:
##########
@@ -279,6 +279,57 @@ public void testRequestGlobalStatisticsEventHandling() throws Exception {
}
}
+ @Test
+ public void testMultipleRequestGlobalStatisticsEvents() throws Exception {
+ try (DataStatisticsCoordinator dataStatisticsCoordinator =
+ createCoordinator(StatisticsType.Map)) {
+ dataStatisticsCoordinator.start();
+ tasksReady(dataStatisticsCoordinator);
+
+ StatisticsEvent checkpoint1Subtask0DataStatisticEvent =
+ Fixtures.createStatisticsEvent(
+ StatisticsType.Map, Fixtures.TASK_STATISTICS_SERIALIZER, 1L, CHAR_KEYS.get("a"));
+ StatisticsEvent checkpoint1Subtask1DataStatisticEvent =
+ Fixtures.createStatisticsEvent(
+ StatisticsType.Map, Fixtures.TASK_STATISTICS_SERIALIZER, 1L, CHAR_KEYS.get("b"));
+
+ dataStatisticsCoordinator.handleEventFromOperator(
+ 0, 0, checkpoint1Subtask0DataStatisticEvent);
+ dataStatisticsCoordinator.handleEventFromOperator(
+ 1, 0, checkpoint1Subtask1DataStatisticEvent);
+
+ waitForCoordinatorToProcessActions(dataStatisticsCoordinator);
+
+ // signature is null
+ dataStatisticsCoordinator.handleEventFromOperator(0, 0, new RequestGlobalStatisticsEvent());
+
+ // Checkpoint StatisticEvent + RequestGlobalStatisticsEvent
+ Awaitility.await("wait for first statistics event")
+ .pollInterval(Duration.ofMillis(10))
+ .atMost(Duration.ofSeconds(10))
+ .until(() -> receivingTasks.getSentEventsForSubtask(0).size() == 2);
+
+ // signature is right
+ int correctSignature = dataStatisticsCoordinator.globalStatistics().hashCode();
+ dataStatisticsCoordinator.handleEventFromOperator(
+ 0, 0, new RequestGlobalStatisticsEvent(correctSignature));
+
+ Thread.sleep(200);
Review Comment:
I know we are sleeping for 200 ms to confirm that no response is sent in this
case. We can probably replace the sleep with
`waitForCoordinatorToProcessActions` and then assert on the sent events immediately.
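A rough sketch of why a barrier can stand in for the sleep is below. It assumes the coordinator handles events on a single thread and that the helper blocks until everything queued before it has run; the quoted diff does not show the helper's implementation, so `CoordinatorBarrierSketch` and all of its methods are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a coordinator that processes events on one thread.
public class CoordinatorBarrierSketch {
  private final ExecutorService coordinatorThread = Executors.newSingleThreadExecutor();
  private final List<String> sentEvents = new CopyOnWriteArrayList<>();

  // Enqueue a request; a response is recorded only when the signature does not match.
  void handleRequest(boolean signatureMatches) {
    coordinatorThread.execute(
        () -> {
          if (!signatureMatches) {
            sentEvents.add("StatisticsEvent");
          }
        });
  }

  // Barrier: a no-op task submitted last completes only after all earlier tasks have run.
  void waitForPendingActions() throws Exception {
    coordinatorThread.submit(() -> {}).get(10, TimeUnit.SECONDS);
  }

  int sentEventCount() {
    return sentEvents.size();
  }

  void close() {
    coordinatorThread.shutdownNow();
  }

  // One stale request and one matching request; returns the number of responses recorded.
  static int demo() {
    CoordinatorBarrierSketch coordinator = new CoordinatorBarrierSketch();
    try {
      coordinator.handleRequest(false); // stale signature: response expected
      coordinator.handleRequest(true); // matching signature: response skipped
      coordinator.waitForPendingActions(); // deterministic, no Thread.sleep needed
      return coordinator.sentEventCount();
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      coordinator.close();
    }
  }
}
```

The barrier only proves that no response was sent if the response itself is produced on the coordinator thread. If sending involves another asynchronous hop, an immediate assert could still race.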