jayceslesar commented on code in PR #1958:
URL: https://github.com/apache/iceberg-python/pull/1958#discussion_r2072448127


##########
pyiceberg/table/inspect.py:
##########
@@ -657,3 +665,37 @@ def all_manifests(self) -> "pa.Table":
             lambda args: self._generate_manifests_table(*args), [(snapshot, True) for snapshot in snapshots]
         )
         return pa.concat_tables(manifests_by_snapshots)
+
+    def orphaned_files(self, location: str, older_than: Optional[timedelta] = timedelta(days=3)) -> Set[str]:
+        try:
+            import pyarrow as pa  # noqa: F401
+        except ModuleNotFoundError as e:
+            raise ModuleNotFoundError("For deleting orphaned files PyArrow needs to be installed") from e
+
+        from pyarrow.fs import FileSelector, FileType
+
+        from pyiceberg.io.pyarrow import _fs_from_file_path
+
+        all_known_files = set()

Review Comment:
   Okay, let me know what you think about the change I just pushed -- see 
`all_known_files`. @Fokko for visibility as well. This should make testing a 
lot easier (if I have both of your blessings here, I will add tests for this 
function) and let us make smarter modifications going forward.
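
To illustrate why pulling the known-file set out helps with testing, here is a minimal sketch, not the code in this PR. It assumes the orphan check reduces to a set difference plus an age filter; `find_orphans`, `listed_files`, and `now` are hypothetical names used only for this example, and the paths are made up.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Set


def find_orphans(
    all_known_files: Set[str],
    listed_files: Dict[str, datetime],
    older_than: Optional[timedelta] = timedelta(days=3),
    now: Optional[datetime] = None,
) -> Set[str]:
    """Return listed paths that the table does not reference and that are old enough.

    Hypothetical helper: `listed_files` maps each path found under the table
    location to its last-modified time. No filesystem or catalog access happens
    here, so the function can be tested with plain sets and dicts.
    """
    now = now or datetime.now(timezone.utc)
    # With older_than=None the age filter is skipped entirely.
    cutoff = now - older_than if older_than is not None else None
    return {
        path
        for path, mtime in listed_files.items()
        if path not in all_known_files and (cutoff is None or mtime < cutoff)
    }


# Example inputs (made-up paths) showing the kind of test this enables.
now = datetime(2025, 5, 4, tzinfo=timezone.utc)
known = {"s3://bucket/data/a.parquet"}
listed = {
    "s3://bucket/data/a.parquet": now - timedelta(days=10),
    "s3://bucket/data/old.parquet": now - timedelta(days=10),
    "s3://bucket/data/new.parquet": now - timedelta(days=1),
}
orphans = find_orphans(known, listed, now=now)
```

With the known-file set passed in, a test only has to build two small collections instead of a full table and filesystem.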



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

