amogh-jahagirdar commented on issue #13737: URL: https://github.com/apache/iceberg/issues/13737#issuecomment-3156985575
If I'm not mistaken, these logs look to be from the executor(s) in the cluster. With expire snapshots there are primarily 2 potentially memory-heavy parts:

1.) Determining the set of reachable files to clean up. This effectively performs an anti-join between the files that exist before expiration and the files that exist after expiration. So if your table has a lot of files, performing this join can be expensive and can OOM on executors. Your table has 250 GB worth of data, so there really shouldn't be too many files (~2k assuming 128 MB files). Assuming a worst case of 1024-byte path lengths, this is ~0.002 GB worth of paths in memory being joined. Even with a suboptimal join, it seems like this shouldn't explode. My math may be off though, so it could be worth playing around with executor memory here.

2.) Collecting all the expired files to the driver for deletion. This is potentially memory heavy on the driver. But given the math above, for your case it doesn't seem like it should even be that much. Besides, it doesn't look like the logs are from the driver, and the option for limiting memory consumption here is to stream results back, which already looks to be set for your job.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
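For anyone who wants to redo the back-of-the-envelope estimate from the comment with their own table numbers, here is a minimal sketch. The function name and its parameters are hypothetical, it uses binary units, and it counts data file paths only. The 250 GB, 128 MB and 1024-byte figures are the assumptions stated in the comment, not measured values.

```python
# Rough estimate of how many bytes of file paths one side of the
# expire-snapshots anti-join has to hold. Illustrative only: the inputs
# are assumptions, and only data file paths are counted.

GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_path_memory(table_bytes, avg_file_bytes, max_path_bytes):
    """Return (file_count, total_path_bytes) for the given assumptions."""
    # Ceiling division: a partially filled last file still has a path.
    file_count = -(-table_bytes // avg_file_bytes)
    return file_count, file_count * max_path_bytes


file_count, path_bytes = estimate_path_memory(
    table_bytes=250 * GIB,      # table size quoted in the comment
    avg_file_bytes=128 * MIB,   # assumed average data file size
    max_path_bytes=1024,        # assumed worst-case path length
)

print(f"files: {file_count}")
print(f"path bytes: {path_bytes} (~{path_bytes / GIB:.3f} GB)")
```

If the real file count is far higher than this estimate (for example, many small files), the same function shows how quickly the path set grows, which would be one reason to revisit executor memory.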
