thomas-pfeiffer commented on issue #2325: URL: https://github.com/apache/iceberg-python/issues/2325#issuecomment-3626362611
@andormarkus regarding your questions:

> What led you to disable the cache entirely rather than clearing at invocation boundaries?

I used the repro script from https://github.com/apache/iceberg-python/issues/2325#issuecomment-3221265037, and when I disabled the cache completely, memory was fully stable even after 2,000 executions. I guess stability was much more of a concern for us: our individual Lambda executions run 2-3 minutes, so repeating executions that failed with out-of-memory errors is worse for us than losing a few seconds on an uncached manifest. (Not taking the remark from https://github.com/apache/iceberg-python/issues/2325#issuecomment-3625428629 into account.)

> Did you try clearing at execution start/end and find it insufficient?

Not really. Disabling the cache was the simpler solution. And since we leverage some other things, we felt that eradicating this memory leak source completely was the better choice for us.

> Are your Lambda executions particularly long-running or processing very large manifest lists?

Average duration was 2-3 minutes over the last few days; it depends a bit on the incoming data in our use case. I haven't checked the manifest lists, to be honest.

> What memory allocation are you working with?

We allocated `2048MB` to the AWS Lambda. On some days the usage goes up to `2022MB`, but it's not very consistent, e.g., yesterday it was mostly around `~1400MB`. It depends highly on the incoming data for us, but there have been no out-of-memory errors since the workaround so far.

Small remark / question regarding your approach:

> Init step - Clear cache at the beginning of Lambda execution
>
> Post-execution step - Clear cache after completing the operation

Doesn't that mean you clear your cache twice, immediately one after another? I would have assumed the post-execution cache clearing to be enough.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
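
To illustrate the closing question about clearing once versus twice, here is a minimal sketch of a single post-execution clear. It does not use the PyIceberg API: `load_manifest`, `process`, and `handler` are hypothetical names, and a `functools.lru_cache` stands in for the manifest cache discussed in the issue.

```python
from functools import lru_cache


# Hypothetical stand-in for a cached manifest read; NOT the PyIceberg API.
@lru_cache(maxsize=128)
def load_manifest(path: str) -> tuple:
    return (path, "entries")


def process(event: dict) -> int:
    # Hypothetical workload: read every manifest path named in the event.
    return sum(len(load_manifest(p)) for p in event.get("manifests", []))


def handler(event: dict, context: object = None) -> int:
    try:
        return process(event)
    finally:
        # Single clear at the end of the invocation. It runs on success and
        # on exceptions, so the next invocation on a warm container starts
        # with an empty cache.
        load_manifest.cache_clear()
```

Under these assumptions, an additional clear at the start of the next invocation would find the cache already empty; it would only matter if a previous invocation ended without reaching the `finally` block.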
