kevinjqliu commented on code in PR #2951:
URL: https://github.com/apache/iceberg-python/pull/2951#discussion_r2729892624
##########
pyiceberg/manifest.py:
##########
@@ -892,15 +891,53 @@ def __hash__(self) -> int:
return hash(self.manifest_path)
-# Global cache for manifest lists
-_manifest_cache: LRUCache[Any, tuple[ManifestFile, ...]] = LRUCache(maxsize=128)
+# Global cache for ManifestFile objects, keyed by manifest_path.
+# This deduplicates ManifestFile objects across manifest lists, which commonly
+# share manifests after append operations.
+_manifest_cache: LRUCache[str, ManifestFile] = LRUCache(maxsize=512)
Review Comment:
Good catch. Now that we're only caching ManifestFile objects, each entry has a
relatively small memory footprint. We were caching whole manifest lists before,
each pointing to many ManifestFiles.
Also, https://github.com/apache/iceberg-python/issues/2952 should make this
configurable.
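
For illustration, a minimal sketch of the per-path deduplication idea discussed
above. This is not the PR's actual code: ManifestFileStub, PathKeyedLRU and
dedupe_manifest_list are hypothetical names, and a small stdlib LRU stands in
for the LRUCache used in pyiceberg/manifest.py.

```python
from collections import OrderedDict
from dataclasses import dataclass
from threading import RLock
from typing import Iterable


@dataclass(frozen=True)
class ManifestFileStub:
    """Hypothetical stand-in for ManifestFile; only the path matters here."""

    manifest_path: str


class PathKeyedLRU:
    """Small LRU cache keyed by manifest_path (assumed design, stdlib only)."""

    def __init__(self, maxsize: int = 512) -> None:
        self._maxsize = maxsize
        self._entries: "OrderedDict[str, ManifestFileStub]" = OrderedDict()
        self._lock = RLock()

    def get_or_add(self, manifest: ManifestFileStub) -> ManifestFileStub:
        """Return the cached object for this path, caching `manifest` if new."""
        key = manifest.manifest_path
        with self._lock:
            cached = self._entries.get(key)
            if cached is not None:
                # Mark as most recently used and reuse the existing object.
                self._entries.move_to_end(key)
                return cached
            self._entries[key] = manifest
            if len(self._entries) > self._maxsize:
                # Evict the least recently used entry.
                self._entries.popitem(last=False)
            return manifest

    def __len__(self) -> int:
        return len(self._entries)

    def __contains__(self, manifest_path: str) -> bool:
        return manifest_path in self._entries


def dedupe_manifest_list(
    cache: PathKeyedLRU, manifests: Iterable[ManifestFileStub]
) -> tuple:
    """Replace each freshly parsed manifest with its cached instance, if any."""
    return tuple(cache.get_or_add(m) for m in manifests)
```

With this shape, two manifest lists that share a manifest end up holding the
same object for it, so the cache cost grows with the number of distinct
manifests rather than with the number of manifest lists read.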
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]