Fokko commented on code in PR #787:
URL: https://github.com/apache/iceberg-python/pull/787#discussion_r1632224368
##########
pyiceberg/table/snapshots.py:
##########
@@ -247,12 +248,19 @@ def __str__(self) -> str:
         result_str = f"{operation}id={self.snapshot_id}{parent_id}{schema_id}"
         return result_str
 
-    def manifests(self, io: FileIO) -> List[ManifestFile]:
-        if self.manifest_list is not None:
-            file = io.new_input(self.manifest_list)
+    @staticmethod
+    @lru_cache
+    def _manifests(io: FileIO, manifest_list: Optional[str]) -> List[ManifestFile]:

Review Comment:
   I don't think this is the best place to add the caching. With the newly introduced delete operation we can produce multiple snapshots. Each snapshot will produce a unique manifest list, but there is a fair chance that those lists point to the same manifests. It would be better to move the caching to the module level, so we can cache across manifest lists.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
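
To illustrate the suggestion in the review comment above, here is a minimal sketch of caching at the module level, keyed by manifest path rather than by manifest list. It does not use pyiceberg: `MANIFEST_LISTS`, `READS`, `load_manifest`, and `manifests` are hypothetical stand-ins, and the `FileIO` argument is left out on purpose (an `lru_cache` key must be hashable, so a real version would have to decide how to handle `io`).

```python
from functools import lru_cache
from typing import Dict, List, Tuple

# Hypothetical stand-in for storage: manifest-list path -> manifest paths.
# Two snapshots have distinct manifest lists that share two manifests.
MANIFEST_LISTS: Dict[str, Tuple[str, ...]] = {
    "snap-1.avro": ("m1.avro", "m2.avro"),
    "snap-2.avro": ("m1.avro", "m2.avro", "m3.avro"),
}

# Records every simulated (expensive) manifest read.
READS: List[str] = []


@lru_cache(maxsize=128)
def load_manifest(manifest_path: str) -> Tuple[str, str]:
    """Module-level cache keyed by manifest path (hypothetical helper)."""
    READS.append(manifest_path)  # only runs on a cache miss
    return (manifest_path, "parsed")


def manifests(manifest_list: str) -> List[Tuple[str, str]]:
    """Resolve a manifest list, reusing manifests cached by earlier lists."""
    return [load_manifest(path) for path in MANIFEST_LISTS[manifest_list]]


manifests("snap-1.avro")
manifests("snap-2.avro")
print(READS)
```

Because the cache key is the manifest path, the second snapshot only triggers a read for the manifest it does not share with the first. A cache keyed by manifest list would miss on every new snapshot, since each snapshot has its own list.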