huaxingao commented on code in PR #14824:
URL: https://github.com/apache/iceberg/pull/14824#discussion_r2616682144
##########
core/src/main/java/org/apache/iceberg/rest/ScanTaskIterable.java:
##########
@@ -137,16 +131,47 @@ public void run() {
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
+ failure.compareAndSet(null, new RuntimeException("PlanWorker was interrupted", e));
+ shutdown.set(true);
} catch (Exception e) {
- throw new RuntimeException("Worker failed processing planTask", e);
+ failure.compareAndSet(null, new RuntimeException("Worker failed processing planTask", e));
+ shutdown.set(true);
} finally {
- int remaining = activeWorkers.decrementAndGet();
+ handleWorkerExit();
+ }
+ }
+
+ private void handleWorkerExit() {
+ boolean isLastWorker = activeWorkers.decrementAndGet() == 0;
+ boolean hasWorkLeft = !planTasks.isEmpty() || !initialFileScanTasks.isEmpty();
+ boolean isShuttingDown = shutdown.get();
+
+ if (isLastWorker && (!hasWorkLeft || isShuttingDown)) {
+ signalCompletion();
+ } else if (isLastWorker && hasWorkLeft) {
+ failure.compareAndSet(
+ null,
+ new IllegalStateException("Workers have exited but there is still work to be done"));
+ shutdown.set(true);
+ }
+ }
+
+ private void signalCompletion() {
+ try {
+ taskQueue.put(DUMMY_TASK);
Review Comment:
Nit: `taskQueue.put(...)` can block indefinitely on a full bounded queue if
the consumer stops draining (e.g., a failure causes `hasNext()` to throw, or the
consumer aborts early). Maybe switch to a timed `offer(...)` here (and at the
other `taskQueue.put(...)` sites) so a worker cannot hang on a queue nobody reads?
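
A rough sketch of the kind of timed offer I have in mind, not a drop-in patch. `OfferWithTimeoutSketch`, `offerOrGiveUp`, the `consumerClosed` flag, and the slice/deadline parameters are all hypothetical names for illustration; only `BlockingQueue.offer(e, timeout, unit)` is the real JDK call. The idea is to retry in short slices and stop once the consumer is known to be gone or an overall deadline passes:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper for illustration; none of these names exist in ScanTaskIterable.
final class OfferWithTimeoutSketch {
  private OfferWithTimeoutSketch() {}

  /**
   * Tries to enqueue {@code item} in short timed slices instead of a blocking put().
   * Returns false without enqueuing if the consumer is marked closed, the overall
   * deadline passes, or the calling thread is interrupted.
   */
  static <T> boolean offerOrGiveUp(
      BlockingQueue<T> queue,
      T item,
      AtomicBoolean consumerClosed, // assumed flag set by the consumer on close/abort
      long sliceMillis,
      long maxWaitMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
    try {
      while (!consumerClosed.get()) {
        // Timed offer: waits at most one slice for space, then reports failure.
        if (queue.offer(item, sliceMillis, TimeUnit.MILLISECONDS)) {
          return true;
        }
        if (System.nanoTime() - deadline >= 0) {
          return false; // overall budget exhausted
        }
      }
      return false; // consumer is gone, nothing will drain the queue
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve interrupt status for the caller
      return false;
    }
  }

  public static void main(String[] args) {
    BlockingQueue<String> taskQueue = new ArrayBlockingQueue<>(1);
    AtomicBoolean consumerClosed = new AtomicBoolean(false);
    // The first offer fits; the second finds the queue full and gives up after ~50 ms.
    System.out.println(offerOrGiveUp(taskQueue, "task", consumerClosed, 10, 50));
    System.out.println(offerOrGiveUp(taskQueue, "DUMMY_TASK", consumerClosed, 10, 50));
  }
}
```

A `false` return would still need handling at each call site (for example, recording a failure), and whether the sentinel may be dropped at all depends on how the consumer detects completion, so treat the give-up policy here as an assumption.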
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.