Fokko commented on code in PR #787:
URL: https://github.com/apache/iceberg-python/pull/787#discussion_r1632224368


##########
pyiceberg/table/snapshots.py:
##########
@@ -247,12 +248,19 @@ def __str__(self) -> str:
         result_str = f"{operation}id={self.snapshot_id}{parent_id}{schema_id}"
         return result_str
 
-    def manifests(self, io: FileIO) -> List[ManifestFile]:
-        if self.manifest_list is not None:
-            file = io.new_input(self.manifest_list)
+    @staticmethod
+    @lru_cache
+    def _manifests(io: FileIO, manifest_list: Optional[str]) -> List[ManifestFile]:

Review Comment:
   I don't think this is the best place to add the caching.
   
   With the newly introduced delete operation, we can produce multiple
   snapshots. Each snapshot produces a unique manifest list, but there is a
   fair chance that those lists point to the same manifests.
   
   It would be better to move the caching to the module level, so we can cache 
across manifest-lists.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

