sungwy commented on issue #2325: URL: https://github.com/apache/iceberg-python/issues/2325#issuecomment-3191532447
Hi @Declow, thanks for reporting this issue. If your observation is correct, it’s definitely a serious matter, and we should address it promptly. I appreciate your initiative in digging into the root cause.

> /Users/dits/git/play-recommendation-input-consumer/.venv/lib/python3.11/site-packages/pyiceberg/avro/reader.py:322: size=1464 B, count=23, average=64 B
> /Users/dits/git/play-recommendation-input-consumer/.venv/lib/python3.11/site-packages/pyiceberg/avro/reader.py:372: size=728 B, count=13, average=56 B
> /Users/dits/git/play-recommendation-input-consumer/.venv/lib/python3.11/site-packages/pyiceberg/avro/reader.py:295: size=620 B, count=1, average=620 B
> /Users/dits/git/play-recommendation-input-consumer/.venv/lib/python3.11/site-packages/pyiceberg/avro/reader.py:453: size=590 B, count=1, average=590 B

Even with the cache set to `1`, it still stores a tuple of `ManifestFile` objects, which grows with each new commit. Since each `ManifestFile` has the base overhead of a Python object instance plus some attributes (e.g., file path, partition), an increase of roughly 100 B to 500 B per commit seems reasonable. Did you notice any memory usage growth beyond this range that might indicate a more substantial leak?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
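The point in the comment above, that a cache of size `1` can still hold a growing amount of memory, can be illustrated with a minimal, self-contained sketch. It does not use pyiceberg: `FakeManifestFile`, `cached_manifests`, and `measure_growth` are hypothetical names invented for illustration, and the sketch assumes one additional manifest per commit. It only shows that `maxsize=1` bounds the number of cached entries, not the size of the single cached tuple.

```python
import tracemalloc
from functools import lru_cache
from typing import List, Tuple


class FakeManifestFile:
    """Hypothetical stand-in for a manifest entry; not pyiceberg's ManifestFile."""

    def __init__(self, path: str, partition: int) -> None:
        self.path = path
        self.partition = partition


@lru_cache(maxsize=1)
def cached_manifests(commit_number: int) -> Tuple[FakeManifestFile, ...]:
    # Assumption: the table after commit N references N manifests.
    # The cache keeps only one entry, but that entry is a tuple of N objects.
    return tuple(
        FakeManifestFile(f"metadata/manifest-{i}.avro", i)
        for i in range(commit_number)
    )


def measure_growth(commits: int) -> List[int]:
    """Return traced memory (bytes) after each simulated commit."""
    cached_manifests.cache_clear()
    tracemalloc.start()
    sizes = []
    for n in range(1, commits + 1):
        cached_manifests(n)  # evicts the previous tuple, retains a longer one
        current, _peak = tracemalloc.get_traced_memory()
        sizes.append(current)
    tracemalloc.stop()
    return sizes


if __name__ == "__main__":
    sizes = measure_growth(50)
    per_commit = (sizes[-1] - sizes[0]) / (len(sizes) - 1)
    print(f"cache entries: {cached_manifests.cache_info().currsize}")
    print(f"average growth per commit: {per_commit:.0f} B")
```

The measured figure also includes the small `sizes` list itself, so treat it as an approximation. Growth that is linear in the number of commits and within a few hundred bytes per commit would match the explanation above; growth well beyond that would point to something else being retained.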
