RussellSpitzer commented on PR #13880:
URL: https://github.com/apache/iceberg/pull/13880#issuecomment-3215795421

   I couldn't pin down exactly what the memory leak in our test suite is, 
but it seems to be related to the task statuses never getting cleared from 
the Spark context during the TestRewriteDataFilesAction test suite. Because 
the suite now runs with an additional config, the number of tasks increased 
dramatically, and I believe this was the root cause of the OOM.
   
   I tried disabling the UI, but that didn't help; the statuses still stuck 
around. 
   
   So I decided to take a different tack and just optimize the test suite 
instead. The main thing I did was go through all of the "Spark Sorts" and 
switch them to normal Java collection sorts. This has two outcomes. First, 
the test suite runs much faster: we had adaptive shuffle disabled for this 
suite, so each sort had to run 200 tasks, and a local sort is much faster 
than the Spark mechanism. Second, the number of tasks is reduced 
dramatically, which decreases the number of "Task Status" objects that hang 
around.
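
   As a rough illustration of the kind of change described above (not the 
actual code from the PR), the sketch below sorts already-collected rows on 
the JVM with a Java Comparator instead of asking Spark to sort them. The 
`Row` class, its `c1`/`c2` fields, and the `sortedLocally` helper are 
hypothetical names invented for this example. The assumption is that the 
test previously did something like a Spark-side sort followed by a collect, 
which triggers a shuffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class LocalSortSketch {

  // Hypothetical stand-in for a row that a test has already collected
  // from a table; the field names are illustrative only.
  static final class Row {
    final int c1;
    final String c2;

    Row(int c1, String c2) {
      this.c1 = c1;
      this.c2 = c2;
    }
  }

  // Sorts a copy of the collected rows on the driver JVM. No Spark job is
  // launched here, so no shuffle tasks (and no per-task status objects)
  // are created for the sort.
  static List<Row> sortedLocally(List<Row> rows) {
    List<Row> copy = new ArrayList<>(rows);
    copy.sort(Comparator.comparingInt((Row r) -> r.c1)
        .thenComparing((Row r) -> r.c2));
    return copy;
  }

  public static void main(String[] args) {
    // Assumed "before" shape of the test: sort in Spark, then collect.
    // Assumed "after" shape: collect unsorted, then sort locally as below.
    List<Row> collected =
        Arrays.asList(new Row(2, "b"), new Row(1, "z"), new Row(1, "a"));

    for (Row r : sortedLocally(collected)) {
      System.out.println(r.c1 + " " + r.c2);
    }
  }
}
```

   This only makes sense when the collected data fits comfortably in driver 
memory, which is normally the case for small test tables.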
   
   If this still ends up being an issue in the future, we can either track 
down the status issue or move some of these tests into a different test suite.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


