RussellSpitzer commented on issue #6326:
URL: https://github.com/apache/iceberg/issues/6326#issuecomment-1333946715

   While I don't have a problem with disabling statistics reporting, I am
pretty dubious that it takes that long. What I believe you are actually seeing
is the task list being created for the first time and stored in a list. We use
a lazy iterator, which needs to be turned into a list before the job begins
(even if statistics are not reported). This means that even if we don't spend
the time iterating the list while estimating stats, we will spend that same
amount of time later when planning tasks. The only difference is that, in the
current case, the second access to "tasks()" is cached, so it's very fast.
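
   To make the point concrete, here is a minimal sketch of that caching
behaviour. It is not the actual Iceberg/Spark code: the class name
`CachedTaskScan`, the `planCount` counter, and the use of plain strings as
tasks are all hypothetical stand-ins. It only shows that whoever calls
"tasks()" first pays the planning cost, and every later call reuses the list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a scan whose task list is planned lazily and then cached.
final class CachedTaskScan {
  private final Supplier<Iterator<String>> planner; // lazy source of tasks (assumed)
  private List<String> tasks = null;                // filled on first access
  int planCount = 0;                                // how many times planning actually ran

  CachedTaskScan(Supplier<Iterator<String>> planner) {
    this.planner = planner;
  }

  List<String> tasks() {
    if (tasks == null) {
      // First access: drain the lazy iterator into a list. This is the slow part,
      // and it happens whether the caller is stats estimation or task planning.
      planCount++;
      List<String> materialized = new ArrayList<>();
      planner.get().forEachRemaining(materialized::add);
      tasks = materialized;
    }
    // Later accesses return the cached list without re-planning.
    return tasks;
  }
}
```

   If stats estimation calls "tasks()" first, it looks slow and planning looks
fast; if stats estimation is skipped, the same cost simply moves to planning.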
   
   In this case the speed could probably be improved by increasing the
parallelism of the manifest reads.
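
   As a rough illustration of what more parallelism buys, the sketch below
reads manifests on a fixed-size thread pool. Everything in it is hypothetical
(`ParallelManifestRead`, `readManifest`, and the string "tasks" are stand-ins,
not Iceberg APIs). In Iceberg itself the shared worker pool size is, as far as
I know, set through the `iceberg.worker.num-threads` system property, but
please treat that as an assumption to verify against your version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: read each manifest on a worker thread, then merge the results.
final class ParallelManifestRead {

  // Stand-in for opening one manifest and listing the data files it references.
  static List<String> readManifest(String manifest) {
    return List.of(manifest + "/file-0", manifest + "/file-1");
  }

  static List<String> planTasks(List<String> manifests, int parallelism) {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      // Submit one read per manifest; up to `parallelism` reads run at once.
      List<Future<List<String>>> futures = new ArrayList<>();
      for (String manifest : manifests) {
        futures.add(pool.submit(() -> readManifest(manifest)));
      }
      // Collect results in submission order so the output is deterministic.
      List<String> tasks = new ArrayList<>();
      for (Future<List<String>> future : futures) {
        tasks.addAll(future.get());
      }
      return tasks;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while reading manifests", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Manifest read failed", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

   With many manifests and I/O-bound reads, a larger `parallelism` value
shortens the first materialization of the task list; it does not change the
total amount of work.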


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

