singhpk234 commented on code in PR #14824:
URL: https://github.com/apache/iceberg/pull/14824#discussion_r2616429267
##########
core/src/main/java/org/apache/iceberg/rest/ScanTaskIterable.java:
##########
@@ -240,10 +239,14 @@ public void close() {
}
private boolean isDone() {
- return taskQueue.isEmpty()
+ // Reorder the conditions to make sure TaskQueue is empty is checked last.
+ // It may happen that a worker is about to add a new task to the queue, but before
+ // that happens, taskQueue.isEmpty() is checked then it completes fast before the
+ // activeWorker is decremented. This would lead to a false negative.
+ return activeWorkers.get() == 0
&& planTasks.isEmpty()
- && activeWorkers.get() == 0
- && initialFileScanTasks.isEmpty();
+ && initialFileScanTasks.isEmpty()
+ && taskQueue.isEmpty();
Review Comment:
> I wonder if we could simplify this by just counting the work like:
Do you mean checking that the producer is done? In this scenario we want all
task production to be finished, so the order matters: we first need to confirm
that the producers have completed producing, and only then check whether the
taskQueue is empty. That ordering is the issue this PR fixes, and we already
have a way to check whether there are active workers: the `activeWorkers` count.
I think with the poison pill analogy, as above, we can signal the end of
production. We are iterating on that together in the PR on my fork (you are
welcome to join the discussion too), but there is an edge case to it. Let me
push Amogh's suggestion to this PR to see what we all think. I'd appreciate
your thoughts on it!
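
To make the poison pill idea concrete, here is a minimal sketch. It is not the code from this PR or from the fork. It assumes a fixed number of workers that never spawn further work, which may not match how `ScanTaskIterable` behaves. The names `PoisonPillSketch` and `drain` and the task strings are hypothetical. The last worker to finish enqueues an empty `Optional` as the pill, so the consumer never has to combine `activeWorkers` and `taskQueue.isEmpty()` in a single check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the ScanTaskIterable implementation.
final class PoisonPillSketch {
  private PoisonPillSketch() {}

  // Runs workerCount producers and returns every task the consumer saw before the pill.
  static List<String> drain(int workerCount, int tasksPerWorker) {
    // Optional.empty() plays the role of the poison pill (assumption of this sketch).
    BlockingQueue<Optional<String>> taskQueue = new LinkedBlockingQueue<>();
    AtomicInteger activeWorkers = new AtomicInteger(workerCount);

    for (int w = 0; w < workerCount; w++) {
      final int id = w;
      new Thread(() -> {
        try {
          for (int t = 0; t < tasksPerWorker; t++) {
            taskQueue.add(Optional.of("task-" + id + "-" + t));
          }
        } finally {
          // Each worker decrements only after adding all of its tasks, so the
          // worker that observes zero knows every task is already in the queue.
          if (activeWorkers.decrementAndGet() == 0) {
            taskQueue.add(Optional.empty());
          }
        }
      }).start();
    }

    List<String> consumed = new ArrayList<>();
    try {
      while (true) {
        Optional<String> next = taskQueue.take();
        if (!next.isPresent()) {
          break; // pill reached: the FIFO queue guarantees all tasks came before it
        }
        consumed.add(next.get());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while draining", e);
    }
    return consumed;
  }
}
```

For example, `PoisonPillSketch.drain(4, 25)` should hand back all 100 tasks. The sketch does not cover workers that enqueue new plan tasks while running, which is where an edge case could arise.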
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]