dyzcs commented on issue #2900: URL: https://github.com/apache/iceberg/issues/2900#issuecomment-1996323592
Hi @ayush-san, I think there are two possible causes of the OOM. One is shown in the figure below. To address it, apply a keyBy on the partition fields of the Iceberg table so that all records belonging to the same partition are written by the same subtask. This keeps the number of partitions that each Task Manager has to write at the same time within a reasonable range and avoids the OOM. The other possible cause is too many small or expired files; the method you described above should resolve that.
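To illustrate why keying by the partition fields limits the number of partitions each writer handles, here is a minimal sketch in plain Python. It is only a simulation and does not use any Flink or Iceberg API. The partition names, the subtask count, and the use of `zlib.crc32` as a stand-in for Flink's own key hashing are all assumptions made for the example.

```python
import zlib
from collections import defaultdict

NUM_SUBTASKS = 4  # hypothetical writer parallelism
PARTITIONS = [f"p{n:03d}" for n in range(40)]  # hypothetical partition values

# Synthetic records as (partition value, payload) pairs.
RECORDS = [(PARTITIONS[(i // NUM_SUBTASKS) % len(PARTITIONS)], i) for i in range(4000)]


def route_round_robin(records, num_subtasks):
    """Distribute records evenly across subtasks, ignoring the partition value."""
    seen = defaultdict(set)
    for i, (partition, _payload) in enumerate(records):
        seen[i % num_subtasks].add(partition)
    return seen


def route_by_key(records, num_subtasks):
    """Send every record of a partition to the same subtask (keyBy-style)."""
    seen = defaultdict(set)
    for partition, _payload in records:
        # crc32 is an assumption standing in for the engine's key hashing.
        subtask = zlib.crc32(partition.encode("utf-8")) % num_subtasks
        seen[subtask].add(partition)
    return seen


def max_open_partitions(seen):
    """Largest number of distinct partitions any single subtask writes to."""
    return max(len(parts) for parts in seen.values())


round_robin = route_round_robin(RECORDS, NUM_SUBTASKS)
keyed = route_by_key(RECORDS, NUM_SUBTASKS)

print("round-robin, max partitions per subtask:", max_open_partitions(round_robin))
print("keyed,       max partitions per subtask:", max_open_partitions(keyed))
```

Without keying, every subtask ends up writing to all 40 partitions in this setup, so each one would have to keep a writer open per partition. With keying, each partition belongs to exactly one subtask, so the open writers are divided among the subtasks. The trade-off is that a few very large partitions can overload the subtasks they hash to.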