kevinjqliu commented on issue #1952: URL: https://github.com/apache/iceberg-python/issues/1952#issuecomment-2851881539
Interesting, so [`self.exists()`](https://github.com/apache/iceberg-python/blob/34c89494c39916b9b1aa7e6da2c24c34c4d7f058/pyiceberg/io/pyarrow.py#L344) returns `True` here, which means `self._file_info()` [returned an object and did not error](https://github.com/apache/iceberg-python/blob/34c89494c39916b9b1aa7e6da2c24c34c4d7f058/pyiceberg/io/pyarrow.py#L288). For a missing file, it should come back with type `FileType.NotFound` and [raise `FileNotFoundError`](https://github.com/apache/iceberg-python/blob/34c89494c39916b9b1aa7e6da2c24c34c4d7f058/pyiceberg/io/pyarrow.py#L276-L277).

According to the [docs](https://arrow.apache.org/docs/python/generated/pyarrow.fs.GcsFileSystem.html#pyarrow.fs.GcsFileSystem.get_file_info):

> A non-existing or unreachable file returns a FileStat object and has a FileType of value NotFound.

The "unreachable file" part is interesting. Also, looking at the code, GCS is doing something with the [not found case here](https://github.com/apache/arrow/blob/067fd2a2c6e54d33b9ae8a3324f59bebe960d485/cpp/src/arrow/filesystem/gcsfs.cc#L351-L360).

It would be helpful for debugging to log the return value of `self._file_info()` and see what it actually is.
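
For anyone who wants to capture that return value, here is a minimal sketch of one way to do it. The helper name `describe_file_info` is hypothetical and not part of pyiceberg or pyarrow; it assumes `filesystem` is a `pyarrow.fs.FileSystem` (for example a `GcsFileSystem`) and only reads attributes that `pyarrow.fs.FileInfo` exposes (`path`, `type`, `is_file`, `size`, `mtime`).

```python
import logging

logger = logging.getLogger(__name__)


def describe_file_info(filesystem, path):
    """Log and return the fields of filesystem.get_file_info(path).

    Hypothetical debugging helper: `filesystem` is assumed to be a
    pyarrow.fs.FileSystem, so the result is a pyarrow.fs.FileInfo.
    """
    info = filesystem.get_file_info(path)
    details = {
        "path": info.path,
        # The question in this thread is whether this is FileType.NotFound.
        "type": str(info.type),
        "is_file": info.is_file,
        "size": info.size,
        "mtime": info.mtime,
    }
    logger.warning("get_file_info(%r) -> %s", path, details)
    return details
```

With pyarrow installed, this could be called as `describe_file_info(pyarrow.fs.GcsFileSystem(), "<bucket>/<path to the missing file>")`, where the path is a placeholder for the file that `exists()` reports as present.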