hydrogenlee commented on PR #14080:
URL: https://github.com/apache/iceberg/pull/14080#issuecomment-3336834669

   > Could we just make sure that the state only restored on subtask 0?
   
   I don't think we can do that, because `globalStatisticsState` is a union 
list state. On restore, Flink collects the state from all subtasks, merges it 
into one list, and sends that merged list to every subtask.
   
   After repeating this run–stop–savepoint–restart cycle several times, the 
`globalStatisticsState` grows extremely large unless we keep only the state of 
subtask 0. The bigger problem is that during restore (see the hot thread 
below), the job gets stuck before `initializeState` is called, so we never get 
the chance to restore only subtask 0.
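
   To make the growth concrete, here is a minimal toy model of union 
redistribution in plain Python. It does not use Flink or Iceberg code. The 
function names (`restore_union`, `snapshot`, `simulate`) are hypothetical, and 
the model assumes that each subtask writes back every entry it received on 
restore, which may not match the real operator exactly.

```python
def restore_union(checkpoint):
    """Union redistribution: every subtask receives the concatenation
    of the entries written by all subtasks."""
    merged = [entry for subtask_state in checkpoint for entry in subtask_state]
    return [list(merged) for _ in checkpoint]


def snapshot(states, only_subtask_zero):
    """Write the checkpoint. If only_subtask_zero is set, every subtask
    other than subtask 0 writes an empty list."""
    if only_subtask_zero:
        return [list(s) if i == 0 else [] for i, s in enumerate(states)]
    return [list(s) for s in states]


def simulate(parallelism, cycles, only_subtask_zero):
    """Run several snapshot/restore cycles and record how many entries
    subtask 0 holds after each restore."""
    # Assumption: each subtask starts with one copy of the global statistics.
    states = [["stats"] for _ in range(parallelism)]
    sizes = []
    for _ in range(cycles):
        checkpoint = snapshot(states, only_subtask_zero)
        states = restore_union(checkpoint)
        sizes.append(len(states[0]))
    return sizes


print(simulate(4, 3, only_subtask_zero=False))  # [4, 16, 64]
print(simulate(4, 3, only_subtask_zero=True))   # [1, 1, 1]
```

   In this model the only place to bound the size is the snapshot side, since 
the merged list is already assembled by the time a subtask sees it on restore.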
   
   See details: https://github.com/apache/iceberg/issues/14079
   <img width="2208" height="1314" alt="image" 
src="https://github.com/user-attachments/assets/163a637f-bcab-422e-91e1-1dfa2f240137"
 />
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

